SDM-TIB / SDM-RDFizer

An Efficient RML-Compliant Engine for Knowledge Graph Construction
https://doi.org/10.5281/zenodo.3872103
Apache License 2.0

Custom functions #68

Closed vemonet closed 2 years ago

vemonet commented 2 years ago

Hi, how is it possible to add custom RML functions to the RDFizer?

There is another issue about it, but it only links to another repository (FunMap), which does not explain how to implement a custom function and then use it.

A custom function is a small function, in your case written in Python, that I can call from the RML mapping.

For example, RocketRML explains it here: https://github.com/semantifyit/RocketRML#functions. RML mapper explains it here: https://github.com/RMLio/rmlmapper-java#including-functions

samiscoding commented 2 years ago

Hey @vemonet,

That is correct! We've been focusing on function interpretation as a separate tool. For this purpose, we have Dragoman, which is designed specifically for interpreting mapping rules that include functions (even more complex ones, such as composite functions or functions that call an API). It can be applied as part of any RML-based knowledge graph creation pipeline, independent of the RDFizing engine you use, and, more importantly, you can easily use your own library of functions with it. You can check this quick demo. Hope this helps! Feel free to contact us for more information :)

vemonet commented 2 years ago

Thanks a lot for the pointers @samiscoding !

I am not sure I understand what you mean by using it as a separate tool.

For example: normally I just write 1 mapping file with RML mappings, and some of those mappings are using functions. When I run the RML-engine the triples are generated, and when a function is used the value of a specific subject/predicate/object is preprocessed by a function (this is my understanding of the RML standards but I might be wrong!)

So this means that I need to execute RDFizer, then Dragoman? Or the other way around? How would it work if I want to use the RML-mapper-java implementation with Dragoman? Because the RML-mapper-java will fail if I don't implement the functions in Java.

Do you have a full example that people could reuse, to make it easier to understand how to use your tool? You already built one in the demo; would it be possible to commit it to the example folder? There is a start with some CSV files and RML mappings, but the Python script is missing, and it's not clear how to combine Dragoman with another RDFizer engine to finally generate RDF.

Ideally, I think a lot of users (probably most) will expect to use Dragoman with your RDFizer, so it would be really helpful to add a complete example to generate RML with RDFizer + Dragoman, because I expect people will implement functions using the built-in function systems of RML-mapper-java or RocketRML.

Later we can add examples for RML-mapper-java and RocketRML. There are not that many RDFizer engines, so it should not be hard to provide examples to users if the system actually works.

Sorry for all those questions. I already managed to implement custom functions for RocketRML in JavaScript and for RML-mapper in Java, but the RML engine implementations all follow completely different philosophies, do not necessarily follow the RML standard, and are not always thoroughly documented, so it can be really hard for a user to work with the different RML engines.

samiscoding commented 2 years ago

Once you have functions in your mappings, you run Dragoman before using any other engine, such as SDM-RDFizer or rmlmapper. Dragoman executes the functions inside the mappings and transforms your mappings and datasets into new datasets and function-free mapping rules. Then you can use any engine (whether Python or Java) to create RDF from the transformed datasets and mappings, just as in the normal case of mappings without functions. An example already exists in the /example/ folder of the Dragoman repository: the CSV and mapping files are what the user provides to Dragoman (plus a config file, as in the config folder), and what you see in example/output is the output of Dragoman. As you can see, the functions are already materialized and the mappings are function-free. Hope this helps for now! Please give it a try, and if you have further questions about Dragoman, do not hesitate to contact us through its GitHub repository.
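To make the "run the function step first" idea concrete, here is a minimal sketch of function materialization in general. It is not Dragoman's code, API, or config format; the names `to_upper`, `materialize`, and the `name_upper` column are hypothetical and chosen only for illustration. The sketch applies a Python function to one CSV column and writes the result into a new column, so a function-free mapping could reference the new column directly.

```python
import csv
import io


def to_upper(value: str) -> str:
    """Hypothetical custom function that a mapping might call."""
    return value.upper()


def materialize(csv_text, input_column, output_column, func):
    """Apply func to input_column of every row and store it in output_column.

    Returns the transformed dataset as CSV text. This only illustrates the
    general pre-processing idea; a real tool would also rewrite the mapping.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames) + [output_column]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # The function result becomes ordinary data in the new dataset.
        row[output_column] = func(row[input_column])
        writer.writerow(row)
    return out.getvalue()


source = "id,name\n1,alice\n2,bob\n"
transformed = materialize(source, "name", "name_upper", to_upper)
print(transformed)
```

After this step, a mapping that previously called the function on `name` would instead read the precomputed `name_upper` column, which is why any RML engine can process the result without knowing about the function.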