w3c / csvw

Documents produced by the CSV on the Web Working Group

allow a column to be the value for multiple triples (future feature) #878

Open baskaufs opened 3 years ago

baskaufs commented 3 years ago

When trying to use the Generating RDF from Tabular Data on the Web Recommendation to map a CSV to the Wikibase data model, I discovered that a column cannot be used as the object of more than one triple. The Wikibase model defines multiple paths from an item to a value. For example, there is a direct property path from an item to a value:

wd:item wdt:prop wd:value.

and a two-property path through a statement node:

wd:item p:prop wds:statement.
wds:statement ps:prop wd:value.

If a column contains the local name of the wd:value URI, it can't be used to construct a value URI for two triples, even if one of them is a virtual column.

I realize that this may involve a level of complexity that goes beyond what the csv2rdf Recommendation is trying to accomplish, but in this real use case, it is not possible to create a metadata description file that will emit all of the triples necessary to conform to the Wikibase model. Instead, I had to emit only the triples for the two-property path and then, as a separate step, use a SPARQL CONSTRUCT query to generate the single-property path triple after loading the data into a triplestore.
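The post-processing workaround described above can be sketched roughly as follows. This is only an illustration in plain Python, not the actual SPARQL CONSTRUCT query or a triplestore: triples are modelled as tuples of prefixed names, the function name `derive_direct_triples` is made up for this example, and statement ranks (which a real Wikibase export would take into account) are ignored.

```python
# Prefixes as they appear in the examples above, kept as plain strings.
P = "p:"      # item -> statement node
PS = "ps:"    # statement node -> value
WDT = "wdt:"  # item -> value (direct path)


def derive_direct_triples(triples):
    """Return the direct-path triples implied by the statement-path triples.

    `triples` is an iterable of (subject, predicate, object) tuples of
    prefixed names. Hypothetical helper, for illustration only.
    """
    triples = list(triples)

    # Remember which item and property each statement node belongs to.
    statement_owner = {}
    for s, p, o in triples:
        if p.startswith(P):
            statement_owner[o] = (s, p[len(P):])

    # For every ps: triple whose statement node is known, emit the wdt: triple.
    direct = []
    for s, p, o in triples:
        if p.startswith(PS) and s in statement_owner:
            item, prop = statement_owner[s]
            if p[len(PS):] == prop:
                direct.append((item, WDT + prop, o))
    return direct


statement_path = [
    ("wd:item", "p:prop", "wds:statement"),
    ("wds:statement", "ps:prop", "wd:value"),
]
direct_path = derive_direct_triples(statement_path)
```

Here `direct_path` holds the single triple `wd:item wdt:prop wd:value`, which is the one that could not be emitted from the metadata description alone.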

I'm not sure what the solution would be for this problem. For triples with URI values, allowing the user to create additional virtual columns with the column variable used in a URI template would solve the problem. But since triples with literal values do not have any key:value pair in the column description that corresponds to valueUrl, there wouldn't seem to be any simple solution for them.

gkellogg commented 3 years ago

We have the notion of a virtual column, which is what allows a join model between different entities described on the same row. Perhaps an extension of the virtual column concept could be added to act as an alias to the value of another column.
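To make the alias idea a bit more concrete, here is a minimal sketch of how it might behave for literal values. Everything in it is an assumption for the sake of illustration: the `aliasOf` key is hypothetical and not part of any CSVW specification, the column descriptions are plain Python dicts rather than a real metadata file, the `expand` helper is a crude stand-in for URI template expansion, and the property names are placeholders.

```python
def expand(template, row):
    # Crude stand-in for URI template expansion: fills {column_name} from the row.
    return template.format(**row)


def emit_row_triples(row, columns):
    """Emit (subject, predicate, object) tuples for one row.

    Each column description has 'name', 'aboutUrl' and 'propertyUrl'.
    A column may also carry the hypothetical 'aliasOf' key, in which case
    its value is read from the named column instead of its own cell.
    """
    triples = []
    for col in columns:
        source = col.get("aliasOf", col["name"])  # hypothetical alias lookup
        subject = expand(col["aboutUrl"], row)
        literal = '"' + row[source] + '"'
        triples.append((subject, col["propertyUrl"], literal))
    return triples


row = {"item_id": "Q1", "label": "example label"}
columns = [
    {"name": "label", "aboutUrl": "wd:{item_id}", "propertyUrl": "ex:propA"},
    # Virtual column that reuses the literal from the "label" column.
    {"name": "label_again", "virtual": True, "aliasOf": "label",
     "aboutUrl": "wd:{item_id}", "propertyUrl": "ex:propB"},
]
triples = emit_row_triples(row, columns)
```

With this input, both `ex:propA` and `ex:propB` triples carry the same literal from the single `label` column, which is the case that has no `valueUrl`-style workaround today.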

Thanks for giving some practical feedback on the spec, and hopefully we'll see enough interest to revisit this in a future Working Group. Of course, the Community Group could, with sufficient participation, come up with its own follow-up spec, much as the JSON-LD CG did, which served as the basis for the JSON-LD 1.1 spec.