RMLio / rmlmapper-java

The RMLMapper executes RML rules to generate high quality Linked Data from multiple originally (semi-)structured data sources
http://rml.io
MIT License

Performance issue on large csv inputs #168

Closed SteBiard closed 11 months ago

SteBiard commented 2 years ago

Hello,

Thanks a lot for this great tool.

I am trying it on large datasets. Even when sampling only 5000 rows from each source and using a fairly ordinary mapping (20 data properties, 5 classes, 5 entity links), I quickly reach a point where it runs for over 10 minutes.

Is there something I am missing in my use of the YARRRML parser + RMLMapper?

If not, I would be interested in a GPU-accelerated version of the mapper, or in recommendations on CPU configuration sizing based on mapping size and source dimensions (even rough ones), if someone has done such a study.

Thanks.

andimou commented 2 years ago

@SteBiard can you share your data and mappings to have a look?

DylanVanAssche commented 11 months ago

No response for a long time, so I am closing this. Re-open if needed.