internetarchive / openlibrary

One webpage for every book ever published!
https://openlibrary.org
GNU Affero General Public License v3.0

Verified_fields: Allow librarians to lock certain work/edition fields #7761

Open mekarpeles opened 1 year ago

mekarpeles commented 1 year ago

Describe the problem that you'd like solved

As of:

For now, let's start with publishers and covers as the fields librarians can lock.
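A minimal sketch of how such a lock might be honored at import time. This is not Open Library's actual importer: the `verified_fields` list on the record, the `LOCKABLE_FIELDS` set, and the `apply_import` function are all hypothetical names used only for illustration.

```python
from typing import Any

# Hypothetical: the two fields proposed as the starting point for locking.
LOCKABLE_FIELDS = {"publishers", "covers"}


def apply_import(existing: dict[str, Any], incoming: dict[str, Any]) -> dict[str, Any]:
    """Merge incoming import data into a record, skipping locked fields.

    Assumes the record stores its locked field names in a
    hypothetical `verified_fields` list.
    """
    locked = set(existing.get("verified_fields", [])) & LOCKABLE_FIELDS
    merged = dict(existing)
    for field, value in incoming.items():
        # Imports may neither change a locked field nor the lock list itself.
        if field in locked or field == "verified_fields":
            continue
        merged[field] = value
    return merged


edition = {
    "title": "Example",
    "publishers": ["Penguin"],
    "covers": [1],
    "verified_fields": ["publishers"],
}
result = apply_import(edition, {"publishers": ["Penguin", "junk"], "covers": [2]})
```

Here `publishers` is left alone because a librarian locked it, while `covers` is still open to updates from the import.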

Stakeholders

@seabelis

mheiman commented 1 year ago

This is potentially useful on its own, but I'm not entirely sure that this directly addresses the issue that @seabelis is seeing.

The appending of extraneous data isn't in any way limited to editions that librarians have touched; probably the vast majority of cases are on records that no one's even noticed yet. As I understand it, what's happening is:

  1. The importer creates an edition and populates the record (hopefully correctly)
  2. A later import matches that same edition and merges in new, less useful data

To me, this looks like a pure import problem. There are a number of ways we could address it, such as:

  1. Configure the importer to just stop appending new data to populated fields (globally or for specific fields)
  2. Establish some ranking of data sources, and use that to resolve differences (e.g. if an edition was created from an IA MARC record, then a BWB import can't modify any already populated fields)
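The two options above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the real import code: the `SOURCE_RANK` table, its source keys, and the function names are hypothetical, and the ranking values are made up to match the IA MARC vs. BWB example.

```python
from typing import Any

# Hypothetical ranking: a higher number means a more trusted source.
SOURCE_RANK = {"bwb": 1, "ia_marc": 2}


def is_populated(value: Any) -> bool:
    """Treat None and empty strings/lists/dicts as unpopulated."""
    return value not in (None, "", [], {})


def merge_keep_existing(existing: dict[str, Any], incoming: dict[str, Any]) -> dict[str, Any]:
    """Option 1: only fill fields that are currently empty."""
    merged = dict(existing)
    for field, value in incoming.items():
        if not is_populated(merged.get(field)):
            merged[field] = value
    return merged


def merge_by_rank(
    existing: dict[str, Any],
    incoming: dict[str, Any],
    existing_source: str,
    incoming_source: str,
) -> dict[str, Any]:
    """Option 2: a lower-ranked source may not modify populated fields."""
    if SOURCE_RANK.get(incoming_source, 0) < SOURCE_RANK.get(existing_source, 0):
        return merge_keep_existing(existing, incoming)
    merged = dict(existing)
    # An equal or higher-ranked source may overwrite, but never with empty values.
    merged.update({k: v for k, v in incoming.items() if is_populated(v)})
    return merged


existing = {"publishers": ["Penguin"], "publish_date": ""}
incoming = {"publishers": ["Penguin Books Ltd, London"], "publish_date": "1999"}

kept = merge_keep_existing(existing, incoming)
lower = merge_by_rank(existing, incoming, "ia_marc", "bwb")
higher = merge_by_rank(existing, incoming, "bwb", "ia_marc")
```

In both `kept` and `lower`, the existing publisher survives and only the empty `publish_date` is filled in; in `higher`, the more trusted source is allowed to replace the publisher.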

In the abstract, I think a librarian lock for fields we don't want messed with is probably a good and useful thing, but I don't think it really resolves the problem that prompted the idea.