Closed SvenLieber closed 9 months ago
The software integration test added with the commit above verifies that the translation correlation list approach works if data is correctly added.
However, in the most recent data integration the correlation list entries were added to the target graph, but the automatic data integration did not take them into account.
There is some corrupted data: as seen in the screenshot below, the KBR identifier is used for all bf:identifiedBy relationships, and the rdf:value of the linked entity also holds the KBR identifier for BnF, KB and Unesco.
This could explain why the automatic integration still added local records even though the correlation list entries had been added: there was no link, so it could not be detected that a record from the correlation list already existed.
There are several issues:
authority instead of manifestation)
The last 4 rows of this snippet extract the column targetKBRIdentifier, but they should also extract targetBnFIdentifier, targetKBIdentifier and targetUnescoIdentifier. Hence the files that should contain the related BnF, KB and Unesco identifiers all use the KBR identifier.
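To illustrate the intended extraction, here is a minimal Python sketch that reads one column per identifier source instead of reusing targetKBRIdentifier for all of them. It is not the project's actual extraction code: the function name, the `ID_COLUMNS` mapping, the key column `targetIdentifier` and the sample values are assumptions made up for this example. Only the four `target...Identifier` column names come from the correlation list described above.

```python
import csv
import io

# Hypothetical mapping: identifier source -> correlation list column.
ID_COLUMNS = {
    "KBR": "targetKBRIdentifier",
    "BnF": "targetBnFIdentifier",
    "KB": "targetKBIdentifier",
    "Unesco": "targetUnescoIdentifier",
}


def extract_identifiers(csv_text, key_column="targetIdentifier"):
    """Return {source: [(key, identifier), ...]}, one column per source.

    key_column is an assumed name for the column identifying the translation.
    """
    result = {source: [] for source in ID_COLUMNS}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for source, column in ID_COLUMNS.items():
            value = (row.get(column) or "").strip()
            if value:  # skip empty cells instead of writing empty identifiers
                result[source].append((row[key_column], value))
    return result


# Made-up sample data, only for illustration.
SAMPLE = """targetIdentifier,targetKBRIdentifier,targetBnFIdentifier,targetKBIdentifier,targetUnescoIdentifier
t1,kbr-1,bnf-1,,unesco-1
t2,kbr-2,,kb-2,
"""

identifiers = extract_identifiers(SAMPLE)
```

With a mapping like this, each per-source output is built from its own column, so a copy-paste slip shows up in a single place rather than in four near-identical extraction lines.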
Additionally, the translation correlation list contained some columns multiple times. For example, the column targetKBIdentifier existed once with values and once without values; the latter was taken for the mapping, and hence there were no KB identifiers.
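A duplicate-column problem like this can be caught before the mapping runs. The following Python sketch checks a CSV header for repeated column names; the function name and the sample data are assumptions for illustration, and it is not known whether the project's pipeline reads the list this way. The last two lines show one way the described behaviour can arise: Python's `csv.DictReader` keeps the value of the last column when a name occurs more than once.

```python
import csv
import io
from collections import Counter


def duplicate_columns(csv_text):
    """Return the sorted column names that occur more than once in the header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    counts = Counter(header)
    return sorted(name for name, n in counts.items() if n > 1)


# Made-up sample: targetKBIdentifier appears twice, the second copy is empty.
SAMPLE = "targetKBRIdentifier,targetKBIdentifier,targetKBIdentifier\nkbr-1,kb-1,\n"

duplicates = duplicate_columns(SAMPLE)

# With duplicate names, csv.DictReader silently keeps the last column's value.
last_wins = next(csv.DictReader(io.StringIO(SAMPLE)))["targetKBIdentifier"]
```

Failing the integration run when `duplicates` is non-empty would surface the problem immediately instead of producing records without KB identifiers.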
There are still two issues:
For some translations we want to prioritize manually curated data, as we already do for contributors (https://github.com/kbrbe/beltrans-data-integration/issues/176).
However, for translations we want a more inclusive approach than for contributors: we only provide some basic information in the correlation list for translations and add more information from the data sources via the schema:sameAs link as necessary. For contributors the approach is exclusive: only the values provided in the correlation list are used. In both cases, however, the correlation list entries should be excluded from the automatic integration!
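The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under assumed data structures, not the project's implementation: the function `integrate`, the dictionary layout, the `local:` prefix and all sample values are hypothetical. The `sameAs` list stands in for the schema:sameAs links mentioned above.

```python
def integrate(curated, sources, inclusive):
    """Combine curated correlation list entries with source records.

    curated:   {entry_id: {"fields": {...}, "sameAs": [source_id, ...]}}
    sources:   {source_id: {field: value}}
    inclusive: True  -> fill missing fields from linked source records
               False -> use only the curated values
    """
    integrated = {}
    claimed = set()  # source records already covered by a curated entry

    for entry_id, entry in curated.items():
        record = dict(entry["fields"])
        for source_id in entry["sameAs"]:
            claimed.add(source_id)
            if inclusive:
                for field, value in sources.get(source_id, {}).items():
                    # Curated values win; sources only fill the gaps.
                    record.setdefault(field, value)
        integrated[entry_id] = record

    # Automatic integration: only records not linked from a curated entry.
    for source_id, fields in sources.items():
        if source_id not in claimed:
            integrated["local:" + source_id] = dict(fields)

    return integrated


# Made-up sample data, only for illustration.
curated = {"c1": {"fields": {"title": "Curated title"}, "sameAs": ["kbr:1"]}}
sources = {
    "kbr:1": {"title": "Source title", "language": "nl"},
    "bnf:9": {"title": "Other"},
}

translations = integrate(curated, sources, inclusive=True)    # inclusive
contributors = integrate(curated, sources, inclusive=False)   # exclusive
```

In both calls the linked record `kbr:1` gets no separate local record. That exclusion depends on the `sameAs` links being correct, which is why the wrong identifiers described earlier led to duplicate local records.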