SDM-TIB / SDM-RDFizer

An Efficient RML-Compliant Engine for Knowledge Graph Construction
https://doi.org/10.5281/zenodo.3872103
Apache License 2.0

Generated SQL queries are not properly constructed #39

Status: Closed (arenas-guerrero-julian closed this issue 3 years ago)

arenas-guerrero-julian commented 3 years ago

Hi!!!

Describe the bug: I tested with MySQL, and some of the generated SQL queries contain errors. I am using GTFS-Madrid-Bench as the data source. Error:

Traceback (most recent call last):
  File "/home/julian/PycharmProjects/SDM-RDFizer/rdfizer/run_rdfizer.py", line 3, in <module>
    semantify(str(sys.argv[1]))
  File "/home/julian/PycharmProjects/SDM-RDFizer/rdfizer/rdfizer/semantify.py", line 3982, in semantify
    number_triple += executor.submit(semantify_mysql, row, row_headers, triples_map, triples_map_list, output_file_descriptor, wr, config[dataset_i]["name"], config[dataset_i]["host"], int(config[dataset_i]["port"]), config[dataset_i]["user"], config[dataset_i]["password"],config[dataset_i]["db"]).result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/usr/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/julian/PycharmProjects/SDM-RDFizer/rdfizer/rdfizer/semantify.py", line 2876, in semantify_mysql
    cursor.execute(query_new)
  File "/home/julian/PycharmProjects/SDM-RDFizer/venv/lib/python3.8/site-packages/mysql/connector/cursor.py", line 551, in execute
    self._handle_result(self._connection.cmd_query(stmt))
  File "/home/julian/PycharmProjects/SDM-RDFizer/venv/lib/python3.8/site-packages/mysql/connector/connection.py", line 490, in cmd_query
    result = self._handle_result(self._send_cmd(ServerCmd.QUERY, query))
  File "/home/julian/PycharmProjects/SDM-RDFizer/venv/lib/python3.8/site-packages/mysql/connector/connection.py", line 395, in _handle_result
    raise errors.get_exception(packet)
mysql.connector.errors.ProgrammingError: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'DISTINCT `service_id`  FROM  gtfs.CALENDAR' at line 1

I have taken a look at the (full) SQL query that RDFizer generates and fails to execute:

SELECT DISTINCT `service_id`, `thursday`, `wednesday`, `sunday`, `tuesday`, `friday`, `saturday`, `start_date`, `monday`, `end_date`  ,  DISTINCT `service_id`  FROM  gtfs.CALENDAR;
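
The query fails because MySQL accepts `DISTINCT` only once, directly after `SELECT`, where it applies to the whole row; here a second `DISTINCT` appears in the middle of the column list, and `service_id` is listed twice. As an illustration only, here is a minimal sketch of how such a query could be assembled so that each column is listed once. This is not RDFizer's actual code: `build_select_distinct` is a hypothetical helper, and it assumes the column and table names come from a trusted mapping file.

```python
def build_select_distinct(columns, table):
    """Build a SELECT DISTINCT statement that lists each column once.

    Hypothetical helper for illustration; not part of SDM-RDFizer.
    Assumes column and table names come from a trusted mapping.
    """
    # dict.fromkeys drops duplicates while keeping first-seen order.
    unique_columns = list(dict.fromkeys(columns))
    if not unique_columns:
        raise ValueError("at least one column is required")
    # Quote each column with backticks, as in the generated query.
    column_list = ", ".join(f"`{name}`" for name in unique_columns)
    # DISTINCT appears exactly once, right after SELECT.
    return f"SELECT DISTINCT {column_list} FROM {table};"


# Columns as they appear in the failing query, including the repeated one.
columns = [
    "service_id", "thursday", "wednesday", "sunday", "tuesday",
    "friday", "saturday", "start_date", "monday", "end_date",
    "service_id",
]
query = build_select_distinct(columns, "gtfs.CALENDAR")
print(query)
```

Deduplicating before joining keeps the column order stable, so the result set has the same shape as the intended query.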

Environment: Running RDFizer on Ubuntu 20.04 LTS, using Python 3.8 and the mysql:5.7 Docker image.

Julián

eiglesias34 commented 3 years ago

Hello,

Would you mind confirming for me if gtfs-rdb.rml.ttl is the mapping you are using?

arenas-guerrero-julian commented 3 years ago

Yes, I am using that mapping. You can find it at https://github.com/oeg-upm/gtfs-bench/blob/master/mappings/gtfs-rdb.rml.ttl

eiglesias34 commented 3 years ago

Hello,

I have updated the SDM-RDFizer. Please confirm if the error has been corrected, so we can close this issue.

arenas-guerrero-julian commented 3 years ago

Working (tested).

Julián