Swirrl / csv2rdf

Clojure library and command line application for converting CSV to RDF. An implementation of the W3C CSVW specifications
Eclipse Public License 1.0

Validation error when adding custom @types #191

Closed RickMoynihan closed 9 months ago

RickMoynihan commented 2 years ago

I think you should be allowed to annotate parts of the CSVW metadata document with custom types.

{
 "@context": "http://www.w3.org/ns/csvw",
 "url": "error.csv",
 "tableSchema": {"@id": "http://myschema",
                 "@type": "http://example.org/CustomTableType",
                 "rdfs:label": "my schema",
                 "columns": [{"@id": "http://myschema#column-odd",
                              "@type": "http://example.org/CustomColumnType",
                              "rdfs:label": "Odd Column Definition",
                              "name": "odd",
                              "datatype": "integer"
                              },
                             {
                              "@id": "http://myschema#column-even",
                              "rdfs:label": "Even Column Definition",
                              "name": "even",
                              "datatype": "integer"
                              }
                             ]
                 }
 }

I think this should be valid, and I'm assuming nothing in the spec prohibits adding your own types. RDF is open-world after all: the interpretation of the above schema file would be that the tableSchema is both a csvw:Schema and an ex:CustomTableType.

However, if you do this you get errors like the one below; it would be very handy if we didn't.

$ java -jar csv2rdf.jar -u 'https://gist.githubusercontent.com/RickMoynihan/6547c59890b21da9ea6010f98615681f/raw/e996ddd1a852878cf1611da0726d8b6fcea9185a/error.csv-metadata.json'
#error {
 :cause Error at path ["tableSchema" "columns" 0 "@type"]: Expected type to be 'Column', received 'http://example.org/CustomColumnType'
 :data {:type :bad-metadata, :path [tableSchema columns 0 @type]}
 :via
 [{:type clojure.lang.ExceptionInfo
   :message Error at path ["tableSchema" "columns" 0 "@type"]: Expected type to be 'Column', received 'http://example.org/CustomColumnType'
   :data {:type :bad-metadata, :path [tableSchema columns 0 @type]}
   :at [clojure.core$ex_info invokeStatic core.clj 4739]}]
 :trace
 [[clojure.core$ex_info invokeStatic core.clj 4739]
  [clojure.core$ex_info invoke core.clj 4739]
  [csv2rdf.metadata.validator$make_error invokeStatic validator.clj 17]
  [csv2rdf.metadata.validator$make_error invoke validator.clj 16]
  [csv2rdf.metadata.validator$type_eq$fn__1339 invoke validator.clj 77]
  [csv2rdf.metadata.validator$optional_key$fn__1370 invoke validator.clj 166]
  [csv2rdf.metadata.validator$kvps$fn__1375$fn__1376 invoke validator.clj 184]
  [clojure.core$map$fn__5587 invoke core.clj 2747]
  [clojure.lang.LazySeq sval LazySeq.java 40]
  [clojure.lang.LazySeq seq LazySeq.java 49]
  [clojure.lang.RT seq RT.java 528]
  [clojure.core$seq__5124 invokeStatic core.clj 137]
  [clojure.core$filter$fn__5614 invoke core.clj 2801]
  [clojure.lang.LazySeq sval LazySeq.java 40]
  [clojure.lang.LazySeq seq LazySeq.java 49]
  [clojure.lang.Cons next Cons.java 39]
  [clojure.lang.RT next RT.java 706]
  [clojure.core$next__5108 invokeStatic core.clj 64]
  [clojure.core.protocols$fn__7852 invokeStatic protocols.clj 169]
  [clojure.core.protocols$fn__7852 invoke protocols.clj 124]
  [clojure.core.protocols$fn__7807$G__7802__7816 invoke protocols.clj 19]
  [clojure.core.protocols$seq_reduce invokeStatic protocols.clj 31]
  [clojure.core.protocols$fn__7835 invokeStatic protocols.clj 75]
  [clojure.core.protocols$fn__7835 invoke protocols.clj 75]
  [clojure.core.protocols$fn__7781$G__7776__7794 invoke protocols.clj 13]
  [clojure.core$reduce invokeStatic core.clj 6748]
  [clojure.core$into invokeStatic core.clj 6815]
  [clojure.core$into invoke core.clj 6807]
  [csv2rdf.metadata.validator$kvps$fn__1375 invoke validator.clj 186]
  [csv2rdf.metadata.types$validate_object_of$fn__1710 invoke types.clj 293]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 120]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 117]
  [csv2rdf.metadata.validator$array_of$fn__1353$fn__1354 invoke validator.clj 130]
  [clojure.core$map_indexed$mapi__8189$fn__8190 invoke core.clj 7228]
  [clojure.lang.LazySeq sval LazySeq.java 40]
  [clojure.lang.LazySeq seq LazySeq.java 49]
  [clojure.lang.RT seq RT.java 528]
  [clojure.core$seq__5124 invokeStatic core.clj 137]
  [clojure.core$filter$fn__5614 invoke core.clj 2801]
  [clojure.lang.LazySeq sval LazySeq.java 40]
  [clojure.lang.LazySeq seq LazySeq.java 49]
  [clojure.lang.RT seq RT.java 528]
  [clojure.lang.LazilyPersistentVector create LazilyPersistentVector.java 44]
  [clojure.core$vec invokeStatic core.clj 377]
  [clojure.core$vec invoke core.clj 367]
  [csv2rdf.metadata.validator$array_of$fn__1353 invoke validator.clj 129]
  [csv2rdf.metadata.column$columns invokeStatic column.clj 98]
  [csv2rdf.metadata.column$columns invoke column.clj 97]
  [csv2rdf.metadata.validator$optional_key$fn__1370 invoke validator.clj 166]
  [csv2rdf.metadata.validator$kvps$fn__1375$fn__1376 invoke validator.clj 184]
  [clojure.core$map$fn__5587 invoke core.clj 2747]
  [clojure.lang.LazySeq sval LazySeq.java 40]
  [clojure.lang.LazySeq seq LazySeq.java 49]
  [clojure.lang.RT seq RT.java 528]
  [clojure.core$seq__5124 invokeStatic core.clj 137]
  [clojure.core$filter$fn__5614 invoke core.clj 2801]
  [clojure.lang.LazySeq sval LazySeq.java 40]
  [clojure.lang.LazySeq seq LazySeq.java 49]
  [clojure.lang.Cons next Cons.java 39]
  [clojure.lang.RT next RT.java 706]
  [clojure.core$next__5108 invokeStatic core.clj 64]
  [clojure.core.protocols$fn__7852 invokeStatic protocols.clj 169]
  [clojure.core.protocols$fn__7852 invoke protocols.clj 124]
  [clojure.core.protocols$fn__7807$G__7802__7816 invoke protocols.clj 19]
  [clojure.core.protocols$seq_reduce invokeStatic protocols.clj 31]
  [clojure.core.protocols$fn__7835 invokeStatic protocols.clj 75]
  [clojure.core.protocols$fn__7835 invoke protocols.clj 75]
  [clojure.core.protocols$fn__7781$G__7776__7794 invoke protocols.clj 13]
  [clojure.core$reduce invokeStatic core.clj 6748]
  [clojure.core$into invokeStatic core.clj 6815]
  [clojure.core$into invoke core.clj 6807]
  [csv2rdf.metadata.validator$kvps$fn__1375 invoke validator.clj 186]
  [csv2rdf.metadata.types$validate_object_of$fn__1710 invoke types.clj 293]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 120]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 117]
  [csv2rdf.metadata.schema$schema invokeStatic schema.clj 98]
  [csv2rdf.metadata.schema$schema invoke schema.clj 97]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 120]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 117]
  [csv2rdf.metadata.validator$optional_key$fn__1370 invoke validator.clj 166]
  [csv2rdf.metadata.validator$kvps$fn__1375$fn__1376 invoke validator.clj 184]
  [clojure.core$map$fn__5587 invoke core.clj 2747]
  [clojure.lang.LazySeq sval LazySeq.java 40]
  [clojure.lang.LazySeq seq LazySeq.java 49]
  [clojure.lang.RT seq RT.java 528]
  [clojure.core$seq__5124 invokeStatic core.clj 137]
  [clojure.core$filter$fn__5614 invoke core.clj 2801]
  [clojure.lang.LazySeq sval LazySeq.java 40]
  [clojure.lang.LazySeq seq LazySeq.java 49]
  [clojure.lang.Cons next Cons.java 39]
  [clojure.lang.RT next RT.java 706]
  [clojure.core$next__5108 invokeStatic core.clj 64]
  [clojure.core.protocols$fn__7852 invokeStatic protocols.clj 169]
  [clojure.core.protocols$fn__7852 invoke protocols.clj 124]
  [clojure.core.protocols$fn__7807$G__7802__7816 invoke protocols.clj 19]
  [clojure.core.protocols$seq_reduce invokeStatic protocols.clj 31]
  [clojure.core.protocols$fn__7835 invokeStatic protocols.clj 75]
  [clojure.core.protocols$fn__7835 invoke protocols.clj 75]
  [clojure.core.protocols$fn__7781$G__7776__7794 invoke protocols.clj 13]
  [clojure.core$reduce invokeStatic core.clj 6748]
  [clojure.core$into invokeStatic core.clj 6815]
  [clojure.core$into invoke core.clj 6807]
  [csv2rdf.metadata.validator$kvps$fn__1375 invoke validator.clj 186]
  [csv2rdf.metadata.types$validate_object_of$fn__1710 invoke types.clj 293]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 120]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 117]
  [csv2rdf.metadata.types$validate_contextual_object$fn__1737 invoke types.clj 369]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 120]
  [csv2rdf.metadata.validator$variant$v__1349 invoke validator.clj 117]
  [csv2rdf.metadata.table$parse_table_json invokeStatic table.clj 41]
  [csv2rdf.metadata.table$parse_table_json invoke table.clj 40]
  [csv2rdf.metadata$parse_metadata_json invokeStatic metadata.clj 17]
  [csv2rdf.metadata$parse_metadata_json invoke metadata.clj 10]
  [csv2rdf.metadata$parse_table_group_from_source invokeStatic metadata.clj 23]
  [csv2rdf.metadata$parse_table_group_from_source invoke metadata.clj 21]
  [csv2rdf.tabular.processing$from_metadata_source invokeStatic processing.clj 21]
  [csv2rdf.tabular.processing$from_metadata_source invoke processing.clj 20]
  [csv2rdf.tabular.processing$get_metadata invokeStatic processing.clj 36]
  [csv2rdf.tabular.processing$get_metadata invoke processing.clj 27]
  [csv2rdf.csvw$csv__GT_rdf invokeStatic csvw.clj 32]
  [csv2rdf.csvw$csv__GT_rdf invoke csvw.clj 23]
  [csv2rdf.csvw$csv__GT_rdf__GT_destination invokeStatic csvw.clj 46]
  [csv2rdf.csvw$csv__GT_rdf__GT_destination invoke csvw.clj 41]
  [csv2rdf.main$write_output invokeStatic main.clj 56]
  [csv2rdf.main$write_output invoke main.clj 54]
  [csv2rdf.main$inner_main invokeStatic main.clj 83]
  [csv2rdf.main$inner_main invoke main.clj 72]
  [csv2rdf.main$_main invokeStatic main.clj 87]
  [csv2rdf.main$_main doInvoke main.clj 85]
  [clojure.lang.RestFn applyTo RestFn.java 137]
  [csv2rdf.main main nil -1]]}

Robsteranium commented 2 years ago

Seems reasonable but isn't the table schema a private interface for CSVW itself? If we want to add types, shouldn't that take place within the data itself?

Do you have an example use case?

RickMoynihan commented 2 years ago

> Seems reasonable but isn't the table schema a private interface for CSVW itself?

What gives you that idea? I've seen no mention of this anywhere; the CSVW metadata files are a subset of JSON-LD.

> If we want to add types, shouldn't that take place within the data itself?

I should probably have said that this may need to occur in our annotated mode, not the standard mode. Regardless, standard mode shouldn't choke on things it doesn't understand (i.e. it should be a tolerant reader :-) )
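
To make the tolerant-reader idea concrete, here is a minimal Python sketch. It is not csv2rdf's Clojure validator; `check_type` and its `strict` flag are hypothetical names used only for illustration. It contrasts rejecting an unexpected `@type` outright with warning about it and carrying on:

```python
import warnings


def check_type(node, expected, path, strict=True):
    """Check the optional @type of a metadata object (hypothetical helper).

    strict=True  -> raise on any @type other than `expected`
    strict=False -> warn, ignore the unrecognised @type, and continue
    """
    actual = node.get("@type")
    if actual is None or actual == expected:
        return node
    message = (f"Error at path {path}: Expected type to be "
               f"'{expected}', received '{actual}'")
    if strict:
        raise ValueError(message)
    warnings.warn(message)
    # Tolerant mode: return a copy without the @type we do not understand.
    return {key: value for key, value in node.items() if key != "@type"}


column = {
    "@type": "http://example.org/CustomColumnType",
    "name": "odd",
    "datatype": "integer",
}
tolerant = check_type(column, "Column",
                      ["tableSchema", "columns", 0, "@type"], strict=False)
```

With `strict=False` the column is still processed as an ordinary column; with the default `strict=True` the same call raises, which is roughly the behaviour reported above.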

lkitching commented 2 years ago

It's my understanding the spec prohibits this - see e.g. the metadata spec for schemas:

> If included, @type is an atomic property that MUST be set to "Schema". Publishers MAY include this to provide additional information to JSON-LD based toolchains.
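
As a rough illustration of that rule, the Python sketch below walks a parsed metadata document and lists every `@type` that differs from the value expected at its position. The `find_type_violations` helper is hypothetical, it is not how csv2rdf implements the check, and it assumes an inline `tableSchema` object and covers only the table, schema and column objects used in this issue:

```python
def find_type_violations(table):
    """Return (path, expected, actual) for each non-conforming @type."""
    violations = []

    def check(node, expected, path):
        actual = node.get("@type")
        # @type is optional, so only a present-but-different value counts.
        if actual is not None and actual != expected:
            violations.append((path + ["@type"], expected, actual))

    check(table, "Table", [])
    schema = table.get("tableSchema")
    # Assumption: only inline schema objects are inspected here.
    if isinstance(schema, dict):
        check(schema, "Schema", ["tableSchema"])
        for index, column in enumerate(schema.get("columns", [])):
            check(column, "Column", ["tableSchema", "columns", index])
    return violations


metadata = {
    "@context": "http://www.w3.org/ns/csvw",
    "url": "error.csv",
    "tableSchema": {
        "@type": "http://example.org/CustomTableType",
        "columns": [
            {"@type": "http://example.org/CustomColumnType", "name": "odd"},
            {"name": "even"},
        ],
    },
}

for path, expected, actual in find_type_violations(metadata):
    print(path, expected, actual)
```

Run against a cut-down version of the metadata from the original report, this flags both the custom schema type and the custom column type.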

Robsteranium commented 2 years ago

Yeah exactly, these are csvw:Columns - their purpose is to annotate the table / provide an RDF translation.

What sort of types would you want to add?

RickMoynihan commented 9 months ago

Closed because the spec doesn't permit this.