Closed by osma 6 years ago
Possibly related to empty ISSN keys #80
This would probably need to be plotted with something like Gephi to find out which work key (or keys) is pulling all of these together.
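As a rough complement to a visual plot, the keys shared by the most works can simply be counted. This is a minimal sketch, not part of the actual pipeline: it assumes the keys are available as (work, key) pairs, and the function names and the toy data are hypothetical.

```python
from collections import defaultdict


def works_per_key(pairs):
    """Map each key to the set of works that carry it."""
    index = defaultdict(set)
    for work, key in pairs:
        index[key].add(work)
    return index


def most_shared_keys(pairs, top=10):
    """Return (key, number of works) tuples, most widely shared keys first."""
    index = works_per_key(pairs)
    # Sort by descending work count, then by key for a stable order.
    ranked = sorted(index.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(key, len(works)) for key, works in ranked[:top]]


# Hypothetical toy data: three works share one generic key.
toy_pairs = [
    ("W1", "julkaisut"),
    ("W2", "julkaisut"),
    ("W3", "julkaisut"),
    ("W3", "some other key"),
]

for key, count in most_shared_keys(toy_pairs):
    print(count, key)
```

Keys near the top of such a ranking would be the first candidates to inspect, though a high count alone does not prove that a key is wrong.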
It's not a single key that's the culprit, though there are problematic ones such as "julkaisut", which should probably be blacklisted. The issue seems to be similar to #70: when a record has multiple series statements because it belongs to several series, the keys somehow get mixed up so that the title-based and ISSN-based keys are incorrectly coupled. For example, this record seems to cause trouble:
005936207 4901 L $$aAmos Andersonin taidemuseon julkaisuja. Uusi sarja,$$x1795-9683 ;$$vnro 79
005936207 4901 L $$aSuomalaisen Kirjallisuuden Seuran toimituksia,$$x0355-1768 ;$$v1338
005936207 4901 L $$aAmos Andersonin taidemuseon julkaisuja. Uusi sarja,$$x0788-0138 ;$$vnro 79
005936207 830 0 L $$aAmos Andersonin taidemuseon julkaisuja.$$pUusi sarja,$$x0788-0138 ;$$vnro 77.
005936207 830 0 L $$aAmos Andersonin taidemuseon julkaisuja.$$pUusi sarja,$$x1795-9683 ;$$vnro 79.
005936207 830 0 L $$aSuomalaisen Kirjallisuuden Seuran toimituksia,$$x0355-1768 ;$$v1338.
005936207 830 0 L $$aAmos Andersonin taidemuseon julkaisuja.$$pUusi sarja,$$x0788-0138 ;$$vnro 79.
will get these series keys:
<http://urn.fi/URN:NBN:fi:bib:me:W00593620702> "amos andersonin taidemuseon julkaisuja uusi sarja" .
<http://urn.fi/URN:NBN:fi:bib:me:W00593620702> "issn:1795-9683" .
<http://urn.fi/URN:NBN:fi:bib:me:W00593620703> "amos andersonin taidemuseon julkaisuja uusi sarja" .
<http://urn.fi/URN:NBN:fi:bib:me:W00593620703> "issn:0355-1768" .
<http://urn.fi/URN:NBN:fi:bib:me:W00593620704> "issn:0788-0138" .
<http://urn.fi/URN:NBN:fi:bib:me:W00593620704> "suomalaisen kirjallisuuden seuran toimituksia" .
<http://urn.fi/URN:NBN:fi:bib:me:W00593620705> "amos andersonin taidemuseon julkaisuja uusi sarja" .
<http://urn.fi/URN:NBN:fi:bib:me:W00593620705> "issn:0788-0138" .
Out of these, at least W00593620703 is problematic: the ISSN and title don't match. The ISSN 0355-1768 belongs to "Suomalaisen Kirjallisuuden Seuran toimituksia", not "Amos Andersonin taidemuseon julkaisuja", which has ISSN 0788-0138.
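To illustrate why such a mismatch matters, here is a minimal sketch that clusters the four works from the listing above. It assumes that works sharing any key are merged transitively, which is an assumption about the merging step rather than something confirmed in this thread. The function name `cluster_works` is hypothetical, and the common URI prefix is dropped for brevity.

```python
from collections import defaultdict

# (work id, series key) pairs copied from the listing above.
PAIRS = [
    ("W00593620702", "amos andersonin taidemuseon julkaisuja uusi sarja"),
    ("W00593620702", "issn:1795-9683"),
    ("W00593620703", "amos andersonin taidemuseon julkaisuja uusi sarja"),
    ("W00593620703", "issn:0355-1768"),
    ("W00593620704", "issn:0788-0138"),
    ("W00593620704", "suomalaisen kirjallisuuden seuran toimituksia"),
    ("W00593620705", "amos andersonin taidemuseon julkaisuja uusi sarja"),
    ("W00593620705", "issn:0788-0138"),
]


def cluster_works(pairs):
    """Group works that are linked, directly or indirectly, by a shared key."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_work_for_key = {}
    for work, key in pairs:
        find(work)  # register the work even if it shares no key
        if key in first_work_for_key:
            union(first_work_for_key[key], work)
        else:
            first_work_for_key[key] = work

    clusters = defaultdict(set)
    for work in list(parent):
        clusters[find(work)].add(work)
    return sorted(sorted(members) for members in clusters.values())


for members in cluster_works(PAIRS):
    print(members)
```

Under that assumption all four works land in a single cluster: the title key links W00593620702, W00593620703 and W00593620705, and "issn:0788-0138" links W00593620704 to W00593620705, so the two unrelated series end up together.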
Opened https://github.com/lcnetdev/marc2bibframe2/issues/71. I think the way marc2bibframe2 couples information from 490 fields with 830 fields is part of the problem, though in the case of the above record there are also problems with the data itself (e.g. wrong ISSNs and volume numbers).
One suggested workaround is to remove all 490 fields from a record during preprocessing if the record also has an 830 field. This way, at least the values from 490 and 830 fields wouldn't be incorrectly coupled, even if it means losing some information. ISSNs should be more likely to appear in 830 fields than in 490 fields, so most of them would be retained.
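A minimal sketch of that workaround, not the actual fix. It assumes the input is one field per line in the layout shown earlier (record id, then a token starting with the three-digit tag, then the field content) and that all lines of a record are consecutive. The function names and the record id 000000001 are hypothetical.

```python
from itertools import groupby


def record_id(line):
    """First whitespace-separated token: the record id."""
    return line.split(None, 1)[0]


def field_tag(line):
    """First three characters of the second token: the MARC tag."""
    return line.split(None, 2)[1][:3]


def drop_490_when_830_present(lines):
    """Yield field lines, omitting 490 fields of records that also have an 830."""
    # groupby only groups adjacent lines, hence the consecutive-lines assumption.
    for _, group in groupby(lines, key=record_id):
        fields = list(group)
        has_830 = any(field_tag(f) == "830" for f in fields)
        for f in fields:
            if has_830 and field_tag(f) == "490":
                continue
            yield f


sample = [
    "005936207 4901 L $$aSuomalaisen Kirjallisuuden Seuran toimituksia,$$x0355-1768 ;$$v1338",
    "005936207 830 0 L $$aSuomalaisen Kirjallisuuden Seuran toimituksia,$$x0355-1768 ;$$v1338.",
    "000000001 4901 L $$aExample series",  # hypothetical record with no 830
]

for line in drop_490_when_830_present(sample):
    print(line)
```

The record with only a 490 field keeps it, so series information is dropped only where an 830 field can stand in for it.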
Fixed by ff987f127e01360579b8738d86b7fce3a16a958a
The series W00017282900 seems to be incorrectly merged from many series. Similar to #76 and/or #70