kbrbe / beltrans-data-integration

Creating a FAIR Linked Data corpus for the BELTRANS research project about Belgian book translations NL-FR and FR-NL between 1970 and 2020
https://www.kbr.be/en/projects/beltrans/
MIT License

Missing sourcePublisherIdentifier even though there is a sourceKBRIdentifier #272

Closed SvenLieber closed 3 months ago

SvenLieber commented 4 months ago

Probably an issue with the generated query or the data integration.

There seems to be an underlying issue: all contributors from the correlation list are treated both as persons and as orgs due to an error in the integration script, see https://github.com/kbrbe/beltrans-data-integration/issues/273.

The corpus versions from March and April still had more or less one KBR publisher identifier for each record with a KBR source identifier. With the first few fixes in some of the commits below, we could already raise the number of source publisher identifiers for FR-NL translations from 1603 to 2121 and for NL-FR translations from 930 to 1063. However, we still have to reach 2121 and 2074 respectively.
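To track the gap described above, a small check like the following could list records that have a KBR source identifier but no source publisher identifier. This is only a sketch: it assumes the corpus is available as a CSV export with columns named `sourceKBRIdentifier` and `sourcePublisherIdentifier` (names taken from the issue title), and `targetIdentifier` plus the sample rows are hypothetical.

```python
import csv
import io


def find_missing_publisher(rows):
    """Return rows with a KBR source identifier but no source publisher identifier."""
    return [
        row for row in rows
        if (row.get("sourceKBRIdentifier") or "").strip()
        and not (row.get("sourcePublisherIdentifier") or "").strip()
    ]


# Hypothetical sample standing in for a corpus CSV export.
sample = """targetIdentifier,sourceKBRIdentifier,sourcePublisherIdentifier
t1,kbr-src-1,pub-1
t2,kbr-src-2,
t3,,
"""

rows = list(csv.DictReader(io.StringIO(sample)))
missing = find_missing_publisher(rows)

# Only t2 has a KBR source identifier without a publisher identifier.
print([row["targetIdentifier"] for row in missing])
```

Running the same check per translation direction on each corpus version would give the counts that still separate the current numbers from the expected ones.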


SvenLieber commented 4 months ago

With the fixes above we reached

Not all of them cover BELTRANS genres. Nevertheless, all of those records should have a publisher since we have the KBR identifier, so there is a systematic issue (nothing to prioritize for BELTRANS genres). However, there can be cases where the KBR publisher is not in MARC field 710 but appears only as text in field 264.
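The 710 versus 264 distinction could be handled with a fallback along these lines. This is a rough sketch, not the integration script's actual logic: records are represented as plain `(tag, subfields)` tuples instead of a real MARC parser, the function name `publisher_from_record` is hypothetical, and it is an assumption that the publisher identifier sits in subfield `0` of field 710 (KBR's export may use a different subfield). Subfield `b` of 264 is the publisher name in standard MARC 21.

```python
def publisher_from_record(fields):
    """Pick a publisher from a simplified MARC record.

    fields: list of (tag, {subfield_code: value}) tuples.
    Prefers a 710 entry with an identifier, falls back to the 264 text.
    """
    # First choice: linked publisher in 710 (identifier subfield "0" is an assumption).
    for tag, subfields in fields:
        if tag == "710" and subfields.get("0"):
            return {
                "identifier": subfields["0"],
                "name": subfields.get("a"),
                "source": "710",
            }
    # Fallback: publisher name as plain text in 264 $b, no identifier available.
    for tag, subfields in fields:
        if tag == "264" and subfields.get("b"):
            return {
                "identifier": None,
                "name": subfields["b"].strip(" ,:;"),
                "source": "264",
            }
    return None


# Hypothetical records for illustration.
linked = [("264", {"b": "Example Publisher,"}), ("710", {"a": "Example Publisher", "0": "12345"})]
text_only = [("264", {"a": "Brussels :", "b": "Example Publisher,"})]

print(publisher_from_record(linked))
print(publisher_from_record(text_only))
```

Records that return `source == "264"` would be the cases where a missing `sourcePublisherIdentifier` is expected rather than a symptom of the integration error.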