Open tobiasschweizer opened 2 years ago
I noticed that CARML supports streams like stdin, which is very useful because no fixed file name ends up in the mapping.
Will there also be the possibility to provide additional "virtual" mappings for handling file names? From the docs I can see that it is possible to provide more than one mapping.
The main mapping would contain one or several logical sources without an rml:source, and an additional mapping would provide the rml:source, see https://github.com/RMLio/rmlmapper-java/issues/97#issuecomment-781224985.
Yes, this is possible. When multiple mapping files are provided, they are first combined into one Model and then mapped to mapping objects internally. So you could organize your files how you wish.
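A minimal sketch of how the second, "virtual" mapping could be generated at run time and then passed alongside the main mapping. Everything here is an assumption for illustration: the base IRI, the output file name, and the helper `write_source_mapping` are made up, and it presumes both files declare the same `@base` so that `<#myLogicalSource>` denotes the same node once the files are combined into one Model.

```python
import tempfile
from pathlib import Path

RML_NS = "http://semweb.mmlab.be/ns/rml#"
BASE = "http://example.org/mapping"  # hypothetical base IRI shared with the main mapping


def write_source_mapping(logical_source: str, source_file: str, directory: Path) -> Path:
    """Write a tiny Turtle file that only attaches rml:source to a logical source."""
    # Escape backslashes and double quotes so the file name is a valid Turtle string.
    escaped = source_file.replace("\\", "\\\\").replace('"', '\\"')
    turtle = (
        f"@base <{BASE}> .\n"
        f"@prefix rml: <{RML_NS}> .\n\n"
        f'<#{logical_source}> rml:source "{escaped}" .\n'
    )
    path = directory / "source.rml.ttl"  # hypothetical file name
    path.write_text(turtle, encoding="utf-8")
    return path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        extra = write_source_mapping("myLogicalSource", "file.json", Path(tmp))
        print(extra.read_text(encoding="utf-8"))
```

The generated file would then be supplied as an additional mapping file next to the main one, which keeps the main mapping free of any fixed file name.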
Would it also be possible to provide mappings inline, e.g.
-m "<#myLogicalSource> rml:source \"file.json\" ."
?
We could make something like that possible, sure. The approach above is a bit tricky, though. For instance, which RDF format do you support for specifying the extra triples? Only N-Quads?
Another approach I was thinking of recently to achieve something similar would be to plug in a template engine with which to template the mapping files, and then provide the template mapping via a CLI option.
> For instance, which RDF format do you support for specifying the extra triples? Only N-Quads?
Good question. For now, I have written my mappings in Turtle.
> Another approach I was thinking of recently to achieve something similar would be to plug in a template engine with which to template the mapping files, and then provide the template mapping via a CLI option.
I was actually thinking about using a template engine to generate the mapping files. So far, I have two mappings for two target types (schema:Book and schema:ScholarlyArticle), and I would like to reduce the two mappings to one template to avoid redundancy. Also, templating might make some RML functions unnecessary, e.g. those that choose a target type depending on a property's value.
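As a rough sketch of reducing the two mappings to one template: the example below uses Python's standard-library `string.Template` as a stand-in for a real template engine, and the mapping content (IRI patterns, iterators, triples map names) is invented for illustration rather than taken from the actual mappings. The logical source deliberately has no rml:source, in line with the split discussed above.

```python
from string import Template

# Hypothetical mapping template; the ${...} placeholders are filled per target type.
MAPPING_TEMPLATE = Template("""\
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix ql: <http://semweb.mmlab.be/ns/ql#> .
@prefix schema: <http://schema.org/> .

<#${name}Mapping>
    rml:logicalSource [
        rml:referenceFormulation ql:JSONPath ;
        rml:iterator "${iterator}"
    ] ;
    rr:subjectMap [
        rr:template "http://example.org/${path}/{id}" ;
        rr:class schema:${name}
    ] .
""")

# One entry per target type; all values are assumptions for this sketch.
TARGETS = [
    {"name": "Book", "path": "book", "iterator": "$.books[*]"},
    {"name": "ScholarlyArticle", "path": "article", "iterator": "$.articles[*]"},
]


def render_mappings() -> dict:
    """Render one Turtle mapping per target type from the shared template."""
    return {t["name"]: MAPPING_TEMPLATE.substitute(t) for t in TARGETS}


if __name__ == "__main__":
    for name, turtle in render_mappings().items():
        print(f"# --- {name} ---")
        print(turtle)
```

The target type becomes a plain template variable here, so no function is needed inside the mapping to pick the class.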
Hi there,
I am finally coming back to this :-)
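I've tried keeping things in two different mapping files, which works fine: the main mapping holds the logical sources without an rml:source, and a second mapping supplies the rml:source, e.g. a file or the CARML stdin.
I would like to further evaluate the two options discussed above:
- providing the extra mapping triples inline on the command line
- templating the mapping files with a template engine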
I'd like to explore option one for now, as it seems quite straightforward. I'll have to think about the point you raised concerning the RDF serialisation format.
I suppose I would need some guidance at some point. Would that be ok for you?
@tobiasschweizer Ok great. Thinking about this a bit more, I'm leaning more towards option 2 as a preference. Because:
- if you have multiple places where you want to use the same value in your mapping, you could reuse a template variable.
- it does not bind you to a specific syntax. For example, if we were to support YARRRML mappings, or any other non-RDF syntax, it could work with the same interface.
The downside of course is that this is not standardized in any way.
Yes, I see your point. Templates would be extremely helpful. However, just to get some grip on CARML Jar I'd like to try to figure out what I can do myself for option one.
I am on vacation next week but I'll be back the week after. Let me know if I can help with anything regarding the templates. Do you already have an engine in mind? I used to work with Twirl (Scala) for SPARQL queries. This worked quite well.
In any case, using a template engine could be thought of as a single, separate step before using the RML engine. So if this could be cleanly abstracted out, maybe other RML engines could pick it up as well?
> Yes, I see your point. Templates would be extremely helpful. However, just to get some grip on CARML Jar I'd like to try to figure out what I can do myself for option one.
Cool!
> I am on vacation next week but I'll be back the week after. Let me know if I can help with anything regarding the templates. Do you already have an engine in mind? I used to work with Twirl (Scala) for SPARQL queries. This worked quite well.
OK. I've used Pebble Templates in a couple of cases, and it works quite nicely, and is pretty customizable, yet simple to implement.
> In any case, using a template engine could be thought of as a single, separate step before using the RML engine. So if this could be cleanly abstracted out, maybe other RML engines could pick it up as well?
Hmm, that's interesting, but possibly tricky. I would have to think about how best to do this.
In any case, I would want the templating to be a separate module, also keeping the relevant option specification separate from the core stuff.
I am experimenting with a template engine in Python (Jinja2) to generate several mappings for different providers from the same source. The mappings differ in terms of IRIs, and it seems quite easy to do this with a template engine. I tried conditional subject maps in RML (but only in YAML, see https://github.com/kg-construct/rml-questions/discussions/17), but I cannot recommend this approach.
Template engines also make it easy to stay consistent when avoiding logical joins, since the same IRI then has to be generated in several places.
I would be happy to share my insights once I have finished the first iteration.
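The point about generating the same IRI in several places without a join can be sketched roughly as follows. This uses the standard-library `string.Template` instead of Jinja2 to stay dependency-free; the provider names, base IRIs, and triples maps are hypothetical, and the logical sources are left out for brevity.

```python
from string import Template

# Hypothetical template: ${author_iri} appears in two triples maps, so both
# always produce the same IRI pattern and no join is needed to link them.
TEMPLATE = Template("""\
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix schema: <http://schema.org/> .

<#BookMapping>
    rr:subjectMap [ rr:template "${book_iri}" ] ;
    rr:predicateObjectMap [
        rr:predicate schema:author ;
        rr:objectMap [ rr:template "${author_iri}" ; rr:termType rr:IRI ]
    ] .

<#AuthorMapping>
    rr:subjectMap [ rr:template "${author_iri}" ; rr:class schema:Person ] .
""")

# Made-up providers that differ only in their IRI base.
PROVIDERS = {
    "providerA": "http://a.example.org",
    "providerB": "http://b.example.org",
}


def render_for(provider: str) -> str:
    """Render the mapping for one provider, defining each IRI pattern once."""
    base = PROVIDERS[provider]
    return TEMPLATE.substitute(
        book_iri=f"{base}/book/{{id}}",
        author_iri=f"{base}/person/{{author_id}}",
    )


if __name__ == "__main__":
    print(render_for("providerA"))
```

Because each IRI pattern is defined exactly once per provider, changing it cannot leave the two triples maps out of sync.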