gbv / k10plus-subjects

Subject analysis of records in K10plus catalogue

Move uniqueness constraint into preprocessing #7

Closed nichtich closed 2 years ago

nichtich commented 2 years ago

The database uniqueness constraint on (ppn,voc,notation) should be removed once cleanup (#2) is extended to make sure the same PPN is not indexed multiple times with the same voc & notation.

nichtich commented 2 years ago

This requires extending this line: we can assume $voc.tsv is grouped by PPN but not sorted by notation, e.g.

12345 XX DA3
12345 YY
12345 XX
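
A minimal sketch of how the preprocessing step could drop such duplicates, in Python rather than whatever the cleanup script actually uses. It assumes tab-separated rows with the PPN in the first column and the notation in the second (the meaning of a third column such as DA3 is not specified here, so it is simply passed through), and that rows are grouped by PPN as stated above. The function name `dedupe_grouped` is made up for illustration.

```python
from itertools import groupby
from typing import Iterable, Iterator


def dedupe_grouped(lines: Iterable[str]) -> Iterator[str]:
    """Yield each (PPN, notation) pair once, keeping the first occurrence.

    Assumption: rows are tab-separated, column 1 is the PPN, column 2 the
    notation, and all rows of one PPN are adjacent (grouped, not sorted).
    """
    rows = (line.rstrip("\n").split("\t") for line in lines if line.strip())
    for _ppn, group in groupby(rows, key=lambda row: row[0]):
        seen = set()  # notations already emitted for the current PPN only
        for row in group:
            notation = row[1]
            if notation not in seen:
                seen.add(notation)
                yield "\t".join(row)


if __name__ == "__main__":
    sample = ["12345\tXX\tDA3\n", "12345\tYY\n", "12345\tXX\n"]
    for row in dedupe_grouped(sample):
        print(row)
```

Because the set is reset for every PPN group, memory use depends on the number of notations per record rather than on the size of the file. A plain `uniq` would not be enough here, since the duplicate XX rows are not adjacent.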