kg-construct / mapping-challenges

Issues for discussion about limitations of current mapping languages
Apache License 2.0

Iteration and reference 'identifiers' #43

Open bjdmeest opened 1 year ago

bjdmeest commented 1 year ago

Do we need to access iteration and reference identifiers from within the RML mapping?

(Kept separate from https://github.com/kg-construct/mapping-challenges/issues/6, as the two issues can be about different things.)

Use cases:

CSVW has a similar approach: https://w3c.github.io/csvw/metadata/#uri-template-properties

bjdmeest commented 1 year ago

Copied over from https://github.com/kg-construct/mapping-challenges/issues/6#issuecomment-1423978948 to make this thread more self-contained.

An additional use case in favor of 'iteration identifiers': https://github.com/RMLio/yarrrml-parser/issues/184 requires being able to get the 'accessing' reference formulation for each reference.

So, for CSV, I can imagine that per iteration you need to be able to identify that iteration (here, the row index would be enough), and per reference you need to identify that reference formulation (here, the combination of row index and column index would be enough).

So a CSV file like the one below

| lastname | firstname |
| --- | --- |
| De Meester | Ben |
| Chaves | David |

could have a CSV iteration like the one below

| firstname |
| --- |
| David |

which could actually expose the following references (relying on https://w3c.github.io/csvw/metadata/#uri-template-properties)

| _sourceRow | firstname | firstname_sourceColumn |
| --- | --- | --- |
| 2 | David | 1 |
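
To make the CSV idea above concrete, here is a minimal Python sketch of an iterator that attaches such identifiers to every row. It is only an illustration, not how any RML processor works: the function name `iterations_with_identifiers` is hypothetical, and the numbering (data rows counted from 1, columns counted from 0) is an assumption chosen purely to reproduce the values in the example above; it may differ from the numbering CSVW itself defines.

```python
import csv
import io


def iterations_with_identifiers(csv_text, delimiter=","):
    """Yield one dict per data row, enriched with identifier keys.

    Hypothetical helper: the key names follow the example in this thread,
    and the index conventions are assumptions (see comment below).
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader)
    # Assumption: data rows are counted from 1 (header excluded),
    # columns are counted from 0.
    for row_index, row in enumerate(reader, start=1):
        record = {"_sourceRow": row_index}
        for column_index, (name, value) in enumerate(zip(header, row)):
            record[name] = value
            record[f"{name}_sourceColumn"] = column_index
        yield record


CSV_TEXT = "lastname,firstname\nDe Meester,Ben\nChaves,David\n"
rows = list(iterations_with_identifiers(CSV_TEXT))

# Second iteration: the row for David.
print(rows[1]["_sourceRow"], rows[1]["firstname"], rows[1]["firstname_sourceColumn"])
```

With identifiers exposed as ordinary references like this, a mapping could use them in templates the same way it uses any other column.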

For JSONPath, you could include the path actually used for each iteration and reference. For example, iteration $.persons[*] with reference * would give, for the first iteration, the identifier $.persons[0] and the reference identifiers lastname and firstname.
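
The JSONPath case can be sketched in the same spirit. The snippet below does not use a JSONPath library; it hand-rolls the single pattern from the example (an iterator of the form `$.<key>[*]` and the reference `*` on an object), and the helper names `iterate_array` and `reference_identifiers` as well as the sample document are assumptions made for illustration.

```python
import json


def iterate_array(document, array_key):
    """Mimic the iterator $.<array_key>[*].

    Yields (iteration identifier, item), where the identifier is the
    concrete path of the item that was reached.
    """
    for index, item in enumerate(document[array_key]):
        yield f"$.{array_key}[{index}]", item


def reference_identifiers(item):
    """Mimic the reference * on a JSON object.

    Each member name identifies the reference that produced a value.
    """
    return list(item.keys())


# Hypothetical sample document, mirroring the CSV example in this thread.
DOC = json.loads(
    '{"persons": ['
    '{"lastname": "De Meester", "firstname": "Ben"},'
    '{"lastname": "Chaves", "firstname": "David"}]}'
)

iterations = list(iterate_array(DOC, "persons"))
first_id, first_item = iterations[0]
print(first_id, reference_identifiers(first_item))
```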

It won't create the most elegant mappings, but it gives users a lot of context to hack things together.