In the last 12 months, there has been a huge increase in corrupted author records, almost always created by BWB "promise" or "pallet" imports. The bad records take several forms: multiple author names in a single record, roles embedded in the name (e.g. editor, translator), names in all lower case, etc. Also, since BWB authors apparently never include dates, strong identifiers, or any other disambiguating information, a large number of duplicate author records are being created.
Below are two different corrupted forms created for the same pair of authors, both of whom already have records in OpenLibrary. Not only do the author records exist, but this exact edition was already cataloged and scanned 15 years ago; the new imports cannot be matched to it because of the metadata corruption.
Evidence / Screenshot (if possible)
Thaddeus Eddy Samuel; Surber
Eddy, Samuel Surber, Thaddeus,
Relevant url?
https://openlibrary.org/books/OL45868829M https://openlibrary.org/books/OL45991226M
Proposal & Constraints
The BWB importer should be banned from creating new author records until it can do so at a quality on par with records created from MARC imports.
New metadata sources should undergo a quality audit before being integrated into the production system.
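As a starting point for such an audit, here is a minimal sketch of a pre-import check that flags the corrupted forms described above (multiple names in one record, roles in the name, all lower case). The function name `author_name_problems`, the role word list, and the separator heuristics are assumptions made for illustration; none of this is taken from the existing Open Library import code, and real name data would need more careful rules.

```python
import re

# Hypothetical list of role words; a real audit would need a longer list.
ROLE_WORDS = re.compile(r"\b(editor|translator|illustrator)\b", re.IGNORECASE)


def author_name_problems(name: str) -> list[str]:
    """Return a list of reasons why an incoming author name looks corrupted."""
    problems = []
    stripped = name.strip()

    # A semicolon, ampersand, or " and " usually means several people in one field.
    if ";" in stripped or "&" in stripped or " and " in stripped.lower():
        problems.append("multiple names")

    # Roles belong in a separate field, not in the name itself.
    if ROLE_WORDS.search(stripped):
        problems.append("role in name")

    # Names with letters but no capitals at all are suspicious.
    if any(c.isalpha() for c in stripped) and stripped == stripped.lower():
        problems.append("all lower case")

    # A dangling separator suggests a badly split list of names.
    if stripped.endswith((",", ";")):
        problems.append("trailing separator")

    return problems


# The two corrupted forms from the evidence above.
for candidate in ["Thaddeus Eddy Samuel; Surber", "Eddy, Samuel Surber, Thaddeus,"]:
    print(candidate, "->", author_name_problems(candidate))
```

An importer could refuse to create a new author record whenever this list is non-empty and queue the record for review instead. These heuristics are deliberately conservative and will miss cases such as two names joined only by a comma.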
Stakeholders
@mekarpeles