RMLio / rmlmapper-java

The RMLMapper executes RML rules to generate high quality Linked Data from multiple originally (semi-)structured data sources
http://rml.io
MIT License

Library Usage: Dynamically set source of mappings #94

Closed by aljoshakoecher 3 years ago

aljoshakoecher commented 3 years ago

Hi,

When using RMLMapper as a library, is there a way to dynamically set the source (e.g. the XML document that shall be mapped)? For my use case, it is kinda awkward to have the path of a source file coded into the mapping document. I have one mapping definition that should be used for various XML documents that all adhere to the same XML schema. So everything in the mapping definition is fixed, but I want to change the source dynamically depending on user input.

I would like to simply give users a jar with my mapping application, which is basically just a wrapper around RMLMapper with a fixed mapping document. My mapping document should ideally live inside the resources folder so that I can deliver one "complete" jar to my users. The XML documents that should be mapped are outside of the jar, so it's a bit of a pain.

Any advice would be appreciated. And keep up the good work on RML, I really like it 👍

justin2004 commented 3 years ago

> it is kinda awkward to have the path of a source file coded into the mapping document.

i just tested and it looks like you can use /dev/stdin (assuming you've got a linux distro).

e.g.

     rml:source "/dev/stdin";

then

cat src/test/resources/example5/museum.json | docker run --rm -i -v `pwd`:/data rmlmapper -v -m  src/test/resources/example5/museum-model.rml.ttl

justin2004 commented 3 years ago

oh and you can of course just iterate through files...

e.g. in bash:

# feed each JSON file to the mapper on stdin; write one .ttl file per input
for in in *.json ; do
  docker run --rm -i -v "$(pwd)":/data rmlmapper -v -m src/test/resources/example5/museum-model.rml.ttl < "$in" > "${in}.ttl"
done

the same could be done with xml files

aljoshakoecher commented 3 years ago

That's an interesting solution, thanks! But unfortunately I don't want to rely on a Linux-specific solution. I had hoped for a way to set / change the source document after loading a mapping into a QuadStore. Wouldn't it be possible to remove the quad containing the source and add it again with a different object (i.e. another source document)? I just stumbled across QuadStore.removeQuad(...) and QuadStore.addQuad(...)

justin2004 commented 3 years ago

gotcha. well you could load the triples (the mapping) into a jena model and run an insert/delete sparql update on the model to replace the source file: https://stackoverflow.com/questions/53843724/sparql-insert-delete

https://jena.apache.org/index.html
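
if pulling in jena only for this feels heavy, the same "swap the source before mapping" idea can be approximated with plain text substitution. Below is a minimal sketch using only the Java standard library, not the Jena/SPARQL route described above. It assumes the mapping bundled in the jar contains a marker such as rml:source "@@SOURCE@@"; that marker, the class name MappingSourceRewriter and its method names are all hypothetical and not part of RMLMapper. The path is escaped so that Windows paths with backslashes stay valid Turtle string literals.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class MappingSourceRewriter {

    // Hypothetical marker; the bundled mapping would contain: rml:source "@@SOURCE@@" ;
    static final String PLACEHOLDER = "@@SOURCE@@";

    /** Escapes backslashes and double quotes so the path is a valid Turtle string literal. */
    static String escapeTurtle(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Returns the mapping text with every placeholder replaced by the escaped source path. */
    static String withSource(String mappingTurtle, String sourcePath) {
        if (!mappingTurtle.contains(PLACEHOLDER)) {
            throw new IllegalArgumentException("mapping has no " + PLACEHOLDER + " placeholder");
        }
        return mappingTurtle.replace(PLACEHOLDER, escapeTurtle(sourcePath));
    }

    /** Reads the mapping from the classpath, fills in the source, and writes it to a temp file. */
    static Path materialize(String resourceName, String sourcePath) throws IOException {
        try (InputStream in = MappingSourceRewriter.class.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException("resource not found: " + resourceName);
            }
            String mapping = new String(in.readAllBytes(), StandardCharsets.UTF_8);
            Path out = Files.createTempFile("mapping-", ".rml.ttl");
            Files.writeString(out, withSource(mapping, sourcePath), StandardCharsets.UTF_8);
            return out;
        }
    }

    public static void main(String[] args) {
        // Tiny inline mapping fragment, just to show the substitution.
        String mapping = "<#LS> rml:source \"@@SOURCE@@\" ;\n  rml:referenceFormulation ql:XPath .";
        String source = args.length > 0 ? args[0] : "C:\\data\\input.xml";
        System.out.println(withSource(mapping, source));
    }
}
```

The temp file returned by materialize(...) can then be handed to the mapper in whatever way the mapping file is loaded today. Unlike a SPARQL update, this never parses the RDF, so it only works if you control the mapping document and can put the marker in it.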

aljoshakoecher commented 3 years ago

Yes, I'm going to try this out. Thanks a lot!