iherman closed this issue 9 years ago
Discussed in meeting 12-Nov-2014 (Minutes); summary ...
Issue not yet closed as the mapping docs are not yet updated.
I think a checker might issue warnings when lexical values do not match what is expected, but I think a processor should emit all the data that it can; this allows downstream tools to make use of it. My own processors typically implement a validate option, which raises an exception if invalid data is encountered, but this defaults to off.
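A minimal sketch of what such an option might look like, assuming a toy processor that only knows two datatypes. The names `process_cells` and `InvalidCellError` are hypothetical and not taken from any actual CSVW implementation.

```python
import warnings


class InvalidCellError(ValueError):
    """Hypothetical exception, raised only when validate=True."""


def process_cells(cells, validate=False):
    """Convert (lexical value, datatype name) pairs to Python values.

    With validate=False (the default) an invalid value triggers a warning
    and is emitted unchanged as a string, so no data is lost.
    With validate=True the first invalid value raises InvalidCellError.
    """
    # Assumption: a tiny datatype table, just for illustration.
    parsers = {"integer": int, "number": float}
    out = []
    for raw, datatype in cells:
        parser = parsers.get(datatype, str)
        try:
            out.append(parser(raw))
        except ValueError as exc:
            message = f"{raw!r} is not a valid {datatype}"
            if validate:
                raise InvalidCellError(message) from exc
            warnings.warn(message + "; emitting it as a string")
            out.append(raw)
    return out
```

Called as `process_cells([("1", "integer"), ("abc", "integer")])`, this sketch warns about `"abc"` but still returns both values; passing `validate=True` turns the same input into an exception.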
This is closely related to issue #54 (it is, essentially, the same problem!). Just making the link and adding the "metadata vocabulary document" label to this.
I think this is clear now in the metadata document (http://w3c.github.io/csvw/metadata/#parsing-cells). The algorithm adds validation errors to the model, and retains the original values as strings if it can't parse them. That gives maximum flexibility to the converters to either ignore or emit the errors, and to either ignore or emit the string values or invalid values.
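The behaviour described here can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the algorithm from the metadata document: the `Cell` class, `parse_cell`, and `to_output` are hypothetical names, and only two datatypes are handled.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Cell:
    string_value: str          # the original text from the CSV file
    value: object              # parsed value, or the original string on failure
    errors: list = field(default_factory=list)


def parse_cell(string_value, datatype):
    """Parse one cell; on failure keep the string and record an error."""
    # Assumption: a tiny datatype table, just for illustration.
    parsers = {"integer": int, "date": date.fromisoformat}
    parser = parsers.get(datatype)
    if parser is None:
        return Cell(string_value, string_value)
    try:
        return Cell(string_value, parser(string_value))
    except ValueError as exc:
        return Cell(string_value, string_value, [f"invalid {datatype}: {exc}"])


def to_output(cells, skip_invalid=False):
    """A converter decides for itself whether cells with errors are emitted."""
    return [c.value for c in cells if not (skip_invalid and c.errors)]
```

Because the errors travel with the cell rather than aborting the parse, one converter can call `to_output(cells)` and emit everything, while another passes `skip_invalid=True` and drops the cells that failed to parse.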
I am fine with this.
+1
csv2rdf and csv2json docs now explicitly state that the triples or JSON output is not checked. Cell values are parsed upstream of the conversion procedure; errors might be reported & it is up to conversion applications to decide what to do in the case errors are present.
A high-level issue is whether the transformation should check the values for proper content or not. Various situations may arise, such as invalid URIs generated by a template or an invalid lexical form for a specified datatype. There seem to be two possibilities
In general, option two is probably a cleaner solution. However, in some cases the transformation is expected to convert values (e.g., generating an ISO-formatted date value for the datetime and related datatypes), at which point some errors may be detected...