We store intermediate SPARQL queries when integrating data with the work-set clustering implementation. These queries are generated from a config and then executed.
SPARQL queries to update properties have a filename that includes the name of the data source as well as the name of the property.
However, the filename does not take the type of the entity into account. This can lead to an incomplete query log.
Imagine we update the property schema:name for KBR persons: the query is stored in property-update-query-KBR-schema_name.sparql and executed. In a later step of the pipeline, we update the property schema:name for KBR organizations. That query is also stored in property-update-query-KBR-schema_name.sparql and hence overwrites the earlier query for persons.
That said, the data itself should usually be correct, because each query is executed immediately after its file is created. The overwrite affects the stored query log, not the update.
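To make the collision concrete, here is a minimal sketch in Python. The function `query_filename` and its `entity_type` parameter are hypothetical and not taken from the actual implementation; the sketch only assumes that filenames follow the pattern seen in the example above. Adding the entity type to the name is one possible way to keep both queries in the log.

```python
from typing import Optional


def query_filename(source: str, prop: str, entity_type: Optional[str] = None) -> str:
    """Build the filename for a stored property-update query.

    Hypothetical helper: mirrors the naming pattern from the example,
    with an optional entity type as a possible fix.
    """
    # A colon is awkward in filenames, so schema:name becomes schema_name.
    safe_prop = prop.replace(":", "_")
    parts = ["property-update-query", source]
    if entity_type is not None:
        # Assumed fix: include the entity type so names no longer collide.
        parts.append(entity_type)
    parts.append(safe_prop)
    return "-".join(parts) + ".sparql"


# Current behaviour: persons and organizations map to the same file,
# so the second query overwrites the first.
current_persons = query_filename("KBR", "schema:name")
current_orgs = query_filename("KBR", "schema:name")

# With the entity type included, each query gets its own file.
proposed_persons = query_filename("KBR", "schema:name", "person")
proposed_orgs = query_filename("KBR", "schema:name", "organization")
```

With this naming, the persons query and the organizations query would be written to separate files, and the query log would keep both.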