SDM-TIB / Dragoman

An Optimized Interpreter for RML Functional Mappings!
Apache License 2.0

The default example doesn't work #3

Open vemonet opened 3 years ago

vemonet commented 3 years ago

Hi @samiscoding, I tried to run Dragoman, but it crashes even with the default existing functions.

No complete example is given, so I had to build my own by reverse engineering the existing code.

The example folder doesn't contain the most important information: the custom-defined Python functions (https://github.com/SDM-TIB/Dragoman/tree/master/example). The demo video doesn't show how to apply the functions in an RML file. It tells us where to write the Python functions, but it doesn't show how to use those functions to actually run a mapping. The functions used in the example/test.ttl RML don't seem to be defined anywhere, so we need to figure out how to use functions by ourselves.

It is not clear what the full URI of a function defined in functions.py is. Normally that is the only thing we need to do: map a URI to a Python function, because everything is a URI in RDF/RML. But there is no trace of defining URIs for functions in Dragoman; we can only define basic strings.
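To make concrete what I mean by mapping a URI to a Python function, here is a minimal sketch. It is not Dragoman's API: the registry, the `rml_function` decorator, and `call_function` are hypothetical names I made up for illustration, and the namespace is simply the clarifyFun prefix from example/test.ttl.

```python
from typing import Callable, Dict

# Assumption: reuse the namespace declared as clarifyFun in example/test.ttl.
CLARIFY_FUN = "http://clarify2020.eu/function/"

# Hypothetical registry: full function IRI -> Python callable.
FUNCTION_REGISTRY: Dict[str, Callable[..., str]] = {}


def rml_function(iri: str):
    """Register the decorated callable under the given full IRI."""
    def decorator(func: Callable[..., str]) -> Callable[..., str]:
        FUNCTION_REGISTRY[iri] = func
        return func
    return decorator


@rml_function(CLARIFY_FUN + "tolower")
def tolower(value: str) -> str:
    """Lowercase the input string."""
    return value.lower()


def call_function(iri: str, *args: str) -> str:
    """Look up a function by its full IRI and call it with the given arguments."""
    try:
        func = FUNCTION_REGISTRY[iri]
    except KeyError:
        raise KeyError(f"No Python function registered for <{iri}>") from None
    return func(*args)


result = call_function("http://clarify2020.eu/function/tolower", "HGNC:5")
print(result)
```

With something like this, the IRI written in the RML mapping would be the only thing a user needs to know in order to call a function, and an unknown IRI fails with a readable message.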

In the example RML you used @prefix clarifyFun: <http://clarify2020.eu/function/> ., so I tried to use it too.

You can find my repo here: https://github.com/vemonet/Dragoman/tree/vemonet-dev. I added a setup.py (it is required to publish a pip package, so I am not sure why there is none for a package that has already been published), and I improved the Dockerfile so the tool runs easily as a CLI instead of through the API (the API just makes everything more complex to use and increases the chances of failure, which reduces the reproducibility of your tool).

I provide information on how to build and run it as comments in the Dockerfile: https://github.com/vemonet/Dragoman/blob/vemonet-dev/Dockerfile.cli (I can open a pull request if you are interested in adding it).

For now I am just trying to use the default tolower function you already defined, but it does not seem to work.

docker run -it --rm -v $(pwd)/example/hgnc:/data ghcr.io/vemonet/dragoman:latest -c config.ini  

Executing hgnc...
Traceback (most recent call last):
  File "/usr/local/bin/dragoman", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/site-packages/Interpreter/__main__.py", line 20, in main
    translate(config_path)
  File "/usr/local/lib/python3.8/site-packages/Interpreter/translate.py", line 791, in translate
    update_mapping(triples_map_list, function_dic, config["datasets"]["output_folder"], config[dataset_i]["mapping"],True,file_projection, config["datasets"]["strategy"])
  File "/usr/local/lib/python3.8/site-packages/Interpreter/connection.py", line 141, in update_mapping
    prefix, url, value = prefix_extraction(original, predicate_object.predicate_map.value)
  File "/usr/local/lib/python3.8/site-packages/Interpreter/connection.py", line 83, in prefix_extraction
    return prefixes[url], url, value
KeyError: 'http://www.w3.org/1999/02/22-rdf-syntax-ns#'
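
My guess from the traceback is that the prefixes dictionary is keyed by namespace URL and the rdf: namespace was never added to it. I have not checked the actual code in connection.py, so the sketch below is only an assumption about the general shape of the problem: `split_iri` and `prefix_lookup` are hypothetical helpers, not Dragoman functions. It shows a lookup that registers a generated prefix for an undeclared namespace instead of raising.

```python
from typing import Dict, Tuple


def split_iri(iri: str) -> Tuple[str, str]:
    """Split an IRI into (namespace, local name) at the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[: cut + 1], iri[cut + 1:]


def prefix_lookup(prefixes: Dict[str, str], iri: str) -> Tuple[str, str, str]:
    """Return (prefix, namespace, local name) for an IRI.

    `prefixes` maps namespace URL -> prefix label. If the namespace is not
    declared, a fresh label (ns0, ns1, ...) is generated and stored, so the
    lookup never raises KeyError.
    """
    namespace, local = split_iri(iri)
    prefix = prefixes.get(namespace)
    if prefix is None:
        n = 0
        while f"ns{n}" in prefixes.values():
            n += 1
        prefix = f"ns{n}"
        prefixes[namespace] = prefix
    return prefix, namespace, local


# Only the clarifyFun prefix is declared; rdf: is missing, as in the traceback.
prefixes = {"http://clarify2020.eu/function/": "clarifyFun"}
found = prefix_lookup(prefixes, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type")
print(found)
```

Declaring the rdf: prefix explicitly in the mapping file might also avoid the error, but I have not verified that.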

It would be really nice to have a full example that works (just git clone, run docker, and it works), so that we can use it as a starting point.

It would also be nice to have a full working example of piping Dragoman and RDFizer together. RML mappings can't do much without functions, so anyone seriously mapping to RDF will need them.

samiscoding commented 3 years ago

Hi @vemonet, thanks a lot for trying Dragoman and giving us feedback. Regarding the missing functions, I just noticed that they are in another branch and were not pushed to the main branch, my bad! I am working on preparing a complete example and demo on using Dragoman, and I will definitely inform you once it is ready :) We will also consider your other comments on Docker. We also welcome any type of collaboration, so if you are interested we'd be happy to hear from you via email :)