sul-dlss / searchworks_traject_indexer

indexing MARC, MODS, and more for SearchWorks

Handle FOLIO deletes in the indexer #663

Closed by cbeer 1 year ago

cbeer commented 2 years ago

When a record is deleted from FOLIO (or marked as suppressed from discovery? or becomes otherwise invalid?), the record should be removed from Solr.

thatbudakguy commented 2 years ago

still grappling with FOLIO's distinction between instance record and source record, but it seems like it's important to understand in the context of deletes. to my knowledge:

for source records (MARC) at the /source-storage/stream/source-records endpoint:

thatbudakguy commented 1 year ago

old PR? https://github.com/sul-dlss/searchworks_traject_indexer/pull/707

thatbudakguy commented 1 year ago

we need to confirm (with @ahafele) how records are "deleted" (marked suppressed, MARC edited, etc.) in FOLIO now so that we can cover all the cases and find some records to test with.

is it OK for us to mark things as suppressed, etc. for testing? are there good records for us to do this with?

ahafele commented 1 year ago

Right now records will be "deleted" by marking them as suppressed at the Instance level. Here is an example: https://folio-test.stanford.edu/inventory/view/0c7e5994-fce7-5a85-a715-384146ec3d3f. You are welcome to mark additional records as suppressed for further testing. If you prefer, I can grab a test set, mark the records for SW testing, and then mark them all as suppressed. Are there any other criteria to consider?

We may consider adding a stat code for DELETE, but I assume SW only cares about suppression and not the reason for suppression. In the future there will be an explicit "mark for deletion" flag but not in Nolana.

Open related question on our end - how will SW handle instance records with ALL items suppressed - will it assume suppression at the instance level or will it need to be explicit?

cbeer commented 1 year ago

> Open related question on our end - how will SW handle instance records with ALL items suppressed - will it assume suppression at the instance level or will it need to be explicit?

I think we'd strongly prefer explicit suppression, especially now that some instances (on-order, electronic, mhlds) are expected to lack items.
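To make the "explicit suppression" preference concrete, here is a minimal sketch of the decision, not the indexer's actual code. The helper name `remove_from_index?` and the sample hashes are hypothetical, and using `suppressFromDiscovery` as the item-level key is an assumption. Only the instance-level flag is consulted, so an instance with no items, or with every item suppressed, stays indexed unless it is itself marked suppressed.

```ruby
# Hypothetical helper (not part of the indexer): should this instance be
# removed from the index? `instance` is a Hash shaped like FOLIO instance
# JSON; `items` is an Array of item Hashes and is deliberately ignored.
def remove_from_index?(instance, items = [])
  # Only the explicit instance-level flag counts. Item-level suppression
  # is not used to infer instance suppression.
  instance['suppressFromDiscovery'] == true
end

# Illustrative data only; the ids are made up.
on_order   = { 'id' => 'example-1', 'suppressFromDiscovery' => false }
suppressed = { 'id' => 'example-2', 'suppressFromDiscovery' => true }
# Assumption: items carry a flag under the same key name.
all_items_suppressed = [{ 'suppressFromDiscovery' => true }]

puts remove_from_index?(on_order)                       # instance without items
puts remove_from_index?(on_order, all_items_suppressed) # items do not matter
puts remove_from_index?(suppressed)                     # explicit flag
```

Keeping the items argument in the signature but unused is only meant to show that item state is intentionally left out of the decision.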

jcoyne commented 1 year ago

Test plan:

jcoyne commented 1 year ago

Executed test plan (PENDING)

I ran this query:

record = Traject::FolioPostgresReader.find_by_catkey('a14598667', 'postgres.url' => '<uri>')
record.instance['suppressFromDiscovery']
=> true

So it seems like the UI is setting the correct thing in the database.
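For reference, a rough sketch of how that flag could translate into a Solr update, assuming the indexer talks to Solr's JSON update handler. The helper name `solr_command_for` and the document id are hypothetical. The `{"delete": {"id": ...}}` shape is Solr's standard JSON delete-by-id command, but how this indexer actually issues deletes is not shown in this thread.

```ruby
require 'json'

# Hypothetical helper: build a Solr JSON update command for one record.
# Returns a delete-by-id command when the instance is suppressed, and nil
# otherwise (meaning the record would be indexed as usual).
def solr_command_for(id, instance)
  return nil unless instance['suppressFromDiscovery'] == true

  { 'delete' => { 'id' => id } }
end

# Assumption: the Solr document id matches the catkey used in the query.
command = solr_command_for('a14598667', { 'suppressFromDiscovery' => true })
puts JSON.generate(command)
```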

jcoyne commented 1 year ago

This now appears to be working. Status updates are showing up in SearchWorks about 20 minutes after the hour.