SDM-TIB / SDM-RDFizer

An Efficient RML-Compliant Engine for Knowledge Graph Construction
https://doi.org/10.5281/zenodo.3872103
Apache License 2.0

SDM-RDFizer crashes when parsing mapping #94

Closed: DylanVanAssche closed this issue 1 year ago

DylanVanAssche commented 1 year ago

Describe the bug

When parsing the mapping for MySQL or PostgreSQL, the SDM-RDFizer crashes:

MySQL crash:

Semantifying out...
TM: http://ex.com/#TriplesMap1
Traceback (most recent call last):
File "//sdm-rdfizer/rdfizer/run_rdfizer.py", line 3, in <module>
semantify(str(sys.argv[1]))
File "/sdm-rdfizer/rdfizer/rdfizer/semantify.py", line 4492, in semantify
number_triple += executor.submit(semantify_mysql, row, row_headers, triples_map, triples_map_list, output_file_descriptor, config[dataset_i]["host"], int(config[dataset_i]["port"]), config[dataset_i]["user"], config[dataset_i]["password"],config[dataset_i]["db"]).result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/sdm-rdfizer/rdfizer/rdfizer/semantify.py", line 3107, in semantify_mysql
object_list = jt[row[row_headers.index(predicate_object_map.object_map.child[0])]]
KeyError: '2'

PostgreSQL crash:

Semantifying out...
TM: http://ex.com/#TriplesMap1
Traceback (most recent call last):
File "//sdm-rdfizer/rdfizer/run_rdfizer.py", line 3, in <module>
semantify(str(sys.argv[1]))
File "/sdm-rdfizer/rdfizer/rdfizer/semantify.py", line 4514, in semantify
number_triple += executor.submit(semantify_postgres, row, row_headers, triples_map, triples_map_list, output_file_descriptor,config[dataset_i]["user"], config[dataset_i]["password"], config[dataset_i]["db"], config[dataset_i]["host"]).result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/sdm-rdfizer/rdfizer/rdfizer/semantify.py", line 3774, in semantify_postgres
object_list = jt[row[row_headers.index(predicate_object_map.object_map.child[0])]]
KeyError: '2'
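
Both tracebacks end in the subscript `jt[...]`, which is a plain dictionary lookup, so a `KeyError: '2'` means the value `'2'` taken from the child row had no entry in the join table. This can happen, for instance, when no parent row matches or when keys are stored under a different type. The sketch below only illustrates that failure mode and a tolerant alternative; the table contents are made up, the helper names are hypothetical, and this is not SDM-RDFizer's actual join code (only the name `jt` is borrowed from the traceback).

```python
# Hypothetical join table: parent join value -> parent subject IRIs.
# The contents are invented for illustration.
jt = {"1": ["http://ex.com/table2/1"]}


def lookup_strict(join_table, child_value):
    """Mirror the failing pattern: raises KeyError when no parent matches."""
    return join_table[child_value]


def lookup_tolerant(join_table, child_value):
    """Treat a child value without a matching parent as an empty join result."""
    return join_table.get(child_value, [])


try:
    lookup_strict(jt, "2")
    strict_outcome = "ok"
except KeyError as exc:
    # str() of a KeyError is the repr of the missing key, as in the traceback.
    strict_outcome = f"KeyError: {exc}"

tolerant_outcome = lookup_tolerant(jt, "2")
```

With the tolerant lookup, a child row without a join partner simply yields no triples instead of aborting the run.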

Mapping:

@base <http://ex.com/> .
@prefix ex: <http://example.com/> .
@prefix rr: <http://www.w3.org/ns/r2rml#> .

<#TriplesMap1> a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "data1" ] ;
    rr:predicateObjectMap [ a rr:PredicateObjectMap ;
            rr:objectMap [ a rr:ReferenceObjectMap ;
                    rr:joinCondition [ a rr:JoinCondition ;
                            rr:child "id" ;
                            rr:parent "id" ] ;
                    rr:parentTriplesMap <#TriplesMap2> ] ;
            rr:predicateMap [ a rr:PredicateMap ;
                    rr:constant ex:j1 ] ] ;
    rr:subjectMap [ rr:template "http://ex.com/table1/{id}" ] .

<#TriplesMap2> a rr:TriplesMap ;
    rr:logicalTable [ rr:tableName "data2" ] ;
    rr:subjectMap [ rr:template "http://ex.com/table2/{id}" ] .

To Reproduce

Steps to reproduce the behavior (and resources):

  1. Run the SDM-RDFizer with the mapping
  2. The SDM-RDFizer crashes with the KeyError shown above

Expected behavior

Mapping is parsed and executed.
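
As a rough illustration of what executing this mapping should produce: under R2RML join semantics, each row of `data1` is paired with the rows of `data2` that have the same `id`, and each pair yields one `ex:j1` triple. The sketch below assumes made-up rows and a hypothetical helper `join_triples`; it is not how SDM-RDFizer implements the join.

```python
# Hypothetical sample rows, invented for illustration.
data1 = [{"id": "1"}, {"id": "2"}]
data2 = [{"id": "2"}, {"id": "3"}]


def join_triples(child_rows, parent_rows):
    """Emit one triple per child/parent pair whose `id` values are equal."""
    # Index parent subjects by join value so each child row needs one lookup.
    parents = {}
    for row in parent_rows:
        parents.setdefault(row["id"], []).append(f"http://ex.com/table2/{row['id']}")

    triples = []
    for row in child_rows:
        subject = f"http://ex.com/table1/{row['id']}"
        # Child rows without a matching parent contribute no triples.
        for obj in parents.get(row["id"], []):
            triples.append((subject, "http://example.com/j1", obj))
    return triples


triples = join_triples(data1, data2)
for s, p, o in triples:
    print(f"<{s}> <{p}> <{o}> .")
```

With these sample rows only `id` 2 appears in both tables, so a single triple linking `table1/2` to `table2/2` is emitted.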

eiglesias34 commented 1 year ago

Hi @DylanVanAssche,

Never good news with you, jajajaja.

I notice that the mappings you are using have logicalTable, not logicalSource. Is that correct? If it is, the SDM-RDFizer only parses RML, not R2RML. If it should be logicalSource I'll get right on fixing the problem.

Sincerely, Enrique

DylanVanAssche commented 1 year ago

Hi Enrique!

Hahahaha, apologies but I report everything I find :joy: !

> I notice that the mappings you are using have logicalTable, not logicalSource. Is that correct? If it is, the SDM-RDFizer only parses RML, not R2RML. If it should be logicalSource I'll get right on fixing the problem.

Oh right, here are the 'right' ones. I always forget that the SDM-RDFizer does not handle R2RML; I have a script locally that transparently transforms the R2RML files to RML for the SDM-RDFizer :P
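
Comparing the two mappings in this thread, the core of such a conversion is replacing each `rr:logicalTable` link with an `rml:logicalSource` link whose node is typed `rml:LogicalSource` and points to a database description. The following is a minimal sketch over plain `(subject, predicate, object)` tuples, not the script mentioned above; the function `r2rml_to_rml`, the blank node labels, and the `source_node` argument are assumptions for illustration, and the `d2rq:` connection details and `rr:sqlVersion` are left out.

```python
RR = "http://www.w3.org/ns/r2rml#"
RML = "http://semweb.mmlab.be/ns/rml#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def r2rml_to_rml(triples, source_node):
    """Rewrite rr:logicalTable links into rml:logicalSource links.

    `triples` is a list of (subject, predicate, object) string tuples and
    `source_node` is a hypothetical identifier for the database description.
    """
    out = []
    for s, p, o in triples:
        if p == RR + "logicalTable":
            # Reuse the same node, reach it through rml:logicalSource, and
            # describe it as an RML logical source backed by a database.
            out.append((s, RML + "logicalSource", o))
            out.append((o, RDF_TYPE, RML + "LogicalSource"))
            out.append((o, RML + "source", source_node))
        else:
            # Everything else (rr:tableName, subject maps, ...) is kept as-is.
            out.append((s, p, o))
    return out


r2rml = [
    ("http://ex.com/#TriplesMap2", RR + "logicalTable", "_:lt2"),
    ("_:lt2", RR + "tableName", "data2"),
]
rml = r2rml_to_rml(r2rml, "_:db")
```

A real converter would more likely work on a parsed RDF graph than on string tuples; the tuple form is used here only to keep the sketch dependency-free.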

Here's the transformed one:

@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix ex: <http://example.com/> .
@prefix rml: <http://semweb.mmlab.be/ns/rml#> .
@prefix rr: <http://www.w3.org/ns/r2rml#> .

<http://ex.com/#TriplesMap1> a rr:TriplesMap ;
    rml:logicalSource [ a rml:LogicalSource ;
            rml:source [ a d2rq:Database ;
                    d2rq:jdbcDSN "jdbc:postgresql://PostgreSQL:5432/db" ;
                    d2rq:jdbcDriver "jdbc:postgresql" ;
                    d2rq:password "root" ;
                    d2rq:username "root" ] ;
            rr:sqlVersion rr:SQL2008 ;
            rr:tableName "data1" ] ;
    rr:predicateObjectMap [ a rr:PredicateObjectMap ;
            rr:objectMap [ a rr:ReferenceObjectMap ;
                    rr:joinCondition [ a rr:JoinCondition ;
                            rr:child "id" ;
                            rr:parent "id" ] ;
                    rr:parentTriplesMap <http://ex.com/#TriplesMap2> ] ;
            rr:predicateMap [ a rr:PredicateMap ;
                    rr:constant ex:j1 ] ] ;
    rr:subjectMap [ rr:template "http://ex.com/table1/{id}" ] .

<http://ex.com/#TriplesMap2> a rr:TriplesMap ;
    rml:logicalSource [ a rml:LogicalSource ;
            rml:source [ a d2rq:Database ;
                    d2rq:jdbcDSN "jdbc:postgresql://PostgreSQL:5432/db" ;
                    d2rq:jdbcDriver "jdbc:postgresql" ;
                    d2rq:password "root" ;
                    d2rq:username "root" ] ;
            rr:sqlVersion rr:SQL2008 ;
            rr:tableName "data2" ] ;
    rr:subjectMap [ rr:template "http://ex.com/table2/{id}" ] .

eiglesias34 commented 1 year ago

Hello again,

I'm quite happy with what you have been reporting, anything that proves beneficial for the SDM-RDFizer is welcome. I'll get right on fixing the problem.

eiglesias34 commented 1 year ago

Hello again,

Can you share with me the data you are using? Using the mapping you gave me with my data does not reproduce the error.

DylanVanAssche commented 1 year ago

example.zip

> I'm quite happy with what you have been reporting, anything that proves beneficial for the SDM-RDFizer is welcome. I'll get right on fixing the problem.

Thanks a lot!

eiglesias34 commented 1 year ago

Hey @DylanVanAssche,

I found the issue with the RDFizer. It wasn't in the parsing but in a small detail of the join. The release I made solves the problem for both MySQL and PostgreSQL.

Thank you again, Enrique

DylanVanAssche commented 1 year ago

It looks like it works again. I'll report new issues if they appear, thanks!