Incomplete query log for data integration queries

We store the intermediate SPARQL queries produced when integrating data with the work-set clustering implementation. These queries are generated from a config and then executed. The filename of each SPARQL property-update query includes the name of the data source and the name of the property.

However, the filename does not take the entity type into account, which can lead to an incomplete query log. For example, when we update the property schema:name for KBR persons, the query is stored in property-update-query-KBR-schema_name.sparql and executed. When a later pipeline step updates schema:name for KBR organizations, the query is stored under the same name, property-update-query-KBR-schema_name.sparql, and overwrites the earlier query for persons.

The integrated data itself should usually still be correct, because each query is executed immediately after its file is created. Only the query log is affected.
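One possible fix is to include the entity type in the filename. The following is only a minimal sketch: the function `query_log_filename` and its parameters are hypothetical and not taken from the repository, and it assumes the current name is built as `property-update-query-<source>-<property>.sparql` with non-alphanumeric characters in the property replaced by underscores, as the example filenames above suggest.

```python
import re


def query_log_filename(source, prop, entity_type=None):
    """Build a log filename for a property-update query.

    Hypothetical helper for illustration only; the real pipeline may
    construct its filenames differently.
    """
    parts = ["property-update-query", source]
    # Assumed fix: add the entity type so that persons and organizations
    # no longer map to the same file.
    if entity_type is not None:
        parts.append(entity_type)
    # Replace characters such as ':' so that schema:name becomes schema_name.
    parts.append(re.sub(r"[^A-Za-z0-9]+", "_", prop))
    return "-".join(parts) + ".sparql"


# Without the entity type, both updates collide on one filename.
old_persons = query_log_filename("KBR", "schema:name")
old_orgs = query_log_filename("KBR", "schema:name")

# With the entity type, each update gets its own log file.
new_persons = query_log_filename("KBR", "schema:name", "persons")
new_orgs = query_log_filename("KBR", "schema:name", "organizations")
```

With this naming, the persons query and the organizations query are written to separate files, so the later step no longer overwrites the earlier one.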

kbrbe / beltrans-data-integration #265