SvenLieber closed this issue 3 months ago
With the fixes above, the number of KBR source publisher identifiers reached:

- FR-NL: 2,121 (instead of 1,603) from 2,912, thus 791 missing
- NL-FR: 1,063 (instead of 930) from 2,074, thus 1,011 missing

Not all of them cover BELTRANS genres. Nevertheless, every one of those records should have a publisher, since we have the KBR identifier, so there is a systematic issue (nothing to prioritize for BELTRANS genres).
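As a quick sanity check of the figures above, here is a minimal sketch that recomputes the gains and the remaining gaps. The direction labels follow the issue description, and it assumes the "from" figures (2,912 and 2,074) are the number of records that should carry a KBR publisher identifier; the dictionary layout and function name are only illustrative.

```python
# Counts taken from this issue. "total" is assumed to be the number of
# records that should have a KBR source publisher identifier.
counts = {
    "FR-NL": {"before": 1603, "after": 2121, "total": 2912},
    "NL-FR": {"before": 930, "after": 1063, "total": 2074},
}


def summarize(c):
    """Return how many identifiers the fixes recovered and how many are still missing."""
    return {
        "gained": c["after"] - c["before"],
        "missing": c["total"] - c["after"],
    }


summary = {direction: summarize(c) for direction, c in counts.items()}

for direction, s in summary.items():
    print(f"{direction}: {s['gained']} recovered, {s['missing']} still missing")
```

The remaining gaps match the 791 and 1,011 reported above.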
However, there can be cases where we do not have a KBR publisher in MARC field 710, but only as text in 264.
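To separate these cases from the systematic ones, the records could be bucketed by where their publisher information lives. The following is only a sketch: the record structure (a dict mapping a MARC tag to a list of subfield dicts) and the function name `publisher_source` are hypothetical and not taken from the integration scripts. It also simplifies by treating any 710 entry as a publisher, although 710 can hold other corporate bodies as well.

```python
def publisher_source(record):
    """Classify where a record's publisher information comes from.

    `record` is a hypothetical simplified structure: a dict mapping a MARC
    tag to a list of subfield dicts, e.g. {"264": [{"b": "Some publisher"}]}.
    """
    # 710 is the corporate-name added entry; simplification: any 710 counts.
    if record.get("710"):
        return "710"
    # 264 $b holds the publisher name as plain text, without an identifier.
    if any(field.get("b") for field in record.get("264", [])):
        return "264-text-only"
    return "none"


# Made-up example records, for illustration only.
examples = [
    {"710": [{"a": "Example publisher"}], "264": [{"b": "Example publisher"}]},
    {"264": [{"a": "Brussels", "b": "Example publisher"}]},
    {"264": [{"a": "Brussels"}]},
]
sources = [publisher_source(r) for r in examples]
```

Records that land in the `264-text-only` bucket would be expected to lack a KBR publisher identifier, so they should not count towards the systematic gap.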
Probably an issue with the generated query or the data integration.
There seems to be an underlying issue: all contributors from the correlation list are treated as both persons and orgs due to an error in the integration script, see https://github.com/kbrbe/beltrans-data-integration/issues/273.
The corpus versions from March and April still had more or less one KBR publisher identifier for each record with a KBR source identifier. With the first few fixes in some of the commits below, we could already raise the number of source publisher identifiers for FR-NL translations from 1,603 to 2,121 and for NL-FR translations from 930 to 1,063. However, we still have to reach 2,912 and 2,074 respectively.