SDM-TIB / SDM-RDFizer

An Efficient RML-Compliant Engine for Knowledge Graph Construction
https://doi.org/10.5281/zenodo.3872103
Apache License 2.0

Missing JSON keys throw error in objectMap #55

Closed mielvds closed 3 years ago

mielvds commented 3 years ago

Describe the bug

When a field is missing from the JSON input, the JSONPath reference does not resolve and the app crashes with a KeyError. Instead, the triple should simply not be produced.

TM: http://example.org/tm/json
Traceback (most recent call last):
  File "/Users/mielvandersande/Tools/SDM-RDFizer/rdfizer/run_rdfizer.py", line 3, in <module>
    semantify(str(sys.argv[1]))
  File "/Users/mielvandersande/Tools/SDM-RDFizer/rdfizer/rdfizer/semantify.py", line 4117, in semantify
    number_triple += executor.submit(semantify_file, triples_map, triples_map_list, ",",output_file_descriptor, wr, config[dataset_i]["name"], data).result()
  File "/Users/mielvandersande/.pyenv/versions/3.8.7/lib/python3.8/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/Users/mielvandersande/.pyenv/versions/3.8.7/lib/python3.8/concurrent/futures/_base.py", line 388, in __get_result
    raise self._exception
  File "/Users/mielvandersande/.pyenv/versions/3.8.7/lib/python3.8/concurrent/futures/thread.py", line 57, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/Users/mielvandersande/Tools/SDM-RDFizer/rdfizer/rdfizer/semantify.py", line 2341, in semantify_file
    object = "<" + string_substitution(predicate_object_map.object_map.value, "{(.+?)}", row, "object",ignore, triples_map.iterator) + ">"
  File "/Users/mielvandersande/Tools/SDM-RDFizer/rdfizer/rdfizer/functions.py", line 315, in string_substitution
    row = row[tp]
KeyError: 'manager'
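
The last frame shows `row = row[tp]` indexing straight into the record, so a record without a `manager` key raises `KeyError`. A minimal sketch of a tolerant lookup follows. `resolve_reference` is a hypothetical helper for illustration only, not a function from SDM-RDFizer, and it assumes simple dotted paths rather than full JSONPath.

```python
def resolve_reference(row, path):
    """Walk a dotted path such as "manager.id" through nested dicts.

    Returns None instead of raising KeyError when any step is missing,
    so the caller can decide to skip the triple.
    """
    current = row
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # path does not resolve for this record
        current = current[key]
    return current


# Hypothetical records modelled on the reported data
with_manager = {"id": "001", "manager": {"id": "a"}}
without_manager = {"id": "003"}

print(resolve_reference(with_manager, "manager.id"))     # a
print(resolve_reference(without_manager, "manager.id"))  # None
```

A `None` result (which also covers a JSON `null` value) would signal that no object term can be built for that record.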

To Reproduce

[
  {
    "id": "001",
    "name": "x",
    "manager": {
      "id": "a"
    },
    "custom_fields": [
      {
        "value": "OR-12345",
        "label": "OR-ID"
      }
    ]
  },
  {
    "id": "002",
    "name": "y",
    "manager": {
      "id": "b"
    },
    "custom_fields": [
      {
        "value": null,
        "label": "OR-ID"
      }
    ]
  },
  {
    "id": "003",
    "name": "z",
    "custom_fields": [
      {
        "value": "OR-1011",
        "label": "OR-ID"
      }
    ]
  }
]

@prefix rr:         <http://www.w3.org/ns/r2rml#> .
@prefix foaf:       <http://xmlns.com/foaf/0.1/> .
@prefix ex:         <http://example.com/> .
@prefix xsd:        <http://www.w3.org/2001/XMLSchema#> .
@prefix rml:        <http://semweb.mmlab.be/ns/rml#> .
@prefix ql:         <http://semweb.mmlab.be/ns/ql#> .
@prefix org:        <http://www.w3.org/ns/org#> .
@prefix skos:       <http://www.w3.org/2004/02/skos/core#> .
@prefix dc:         <http://purl.org/dc/terms/> .
@base               <http://example.org/tm/> .

<json>
  a rr:TriplesMap;

  rml:logicalSource [
    rml:source "data.json";
    rml:referenceFormulation ql:JSONPath;
    rml:iterator "$[*]"
  ];
  rr:predicateObjectMap [ 
    rr:predicate ex:property; 
    rr:objectMap [ rml:template "manager.id" ]
  ].
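
For the data above, the expected outcome is an object for records 001 and 002 and no triple for record 003. The sketch below illustrates that behaviour using the `{(.+?)}` placeholder pattern visible in the traceback. The template string `http://example.com/manager/{manager.id}` and the helper `expand_template` are assumptions made for illustration. Neither is taken from the mapping above or from the SDM-RDFizer code.

```python
import re

# Same placeholder pattern that appears in the traceback
PLACEHOLDER = re.compile(r"{(.+?)}")


def expand_template(template, row):
    """Fill each {a.b} placeholder from row.

    Returns None when any placeholder is missing or null, which the
    caller can treat as "do not produce this triple".
    """
    def lookup(match):
        value = row
        for key in match.group(1).split("."):
            value = value[key]  # raises KeyError/TypeError if unresolved
        if value is None:
            raise KeyError(match.group(1))  # treat JSON null as missing
        return str(value)

    try:
        return PLACEHOLDER.sub(lookup, template)
    except (KeyError, TypeError):
        return None


# Trimmed-down version of the three records from the report
records = [
    {"id": "001", "manager": {"id": "a"}},
    {"id": "002", "manager": {"id": "b"}},
    {"id": "003"},
]
template = "http://example.com/manager/{manager.id}"  # hypothetical template
objects = [expand_template(template, r) for r in records]
print(objects)
# ['http://example.com/manager/a', 'http://example.com/manager/b', None]
```

Records whose expansion is `None` would simply be skipped when writing triples.
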
eiglesias34 commented 3 years ago

Hello,

I fixed the issue. I made sure that the verification works for both rr:template and rml:reference. Note that the mapping you provided did not have a subject map, and the object map was not syntactically correct. Please test it out and tell me if anything else needs to be fixed.

Thank you again for using the SDM-RDFizer

mielvds commented 3 years ago

thx @eiglesias34. Indeed, I quickly deleted some stuff for readability and accidentally made the example invalid.