scientist-softserv / oral-history

UCLA LIBRARY-CENTER FOR ORAL HISTORY RESEARCH --Documenting the histories of Los Angeles-- The UCLA Library creates a vibrant nexus of ideas, collections, expertise, and spaces in which users illuminate solutions for local and global challenges. We constantly evolve to advance UCLA’s research, education, and public service mission by empowering and
https://oralhistory.library.ucla.edu/

How does the script that deletes work? #65 #9

Closed labradford closed 1 year ago

labradford commented 1 year ago

UCLA developers would like more information on the deletion error and the manual delete solution (some context: https://uclalibrary.slack.com/archives/CFZ8P3Z98/p1607126162128500?thread_ts=1607114900.122500&cid=CFZ8P3Z98). Their question: “Does deletion cover any updates done to the original record, like updating/deleting/adding metadata values? I tried testing these operations on the Dan Curry interview and reran the manual import, but the record is still stale on the test site.”

labradford commented 1 year ago

Lea Ann Bradford April 2022

Deleting Items from the Application

When the delete parameter is passed into the import function, the remove_deleted_records method is run during the import process. The delete parameter is passed into the import function when the import button on the Admin page is clicked.
During the import process, the ids of the records processed are kept in an array called new_record_ids.
Once all of the items have been processed during the import, the app looks up all of the existing records in the Solr index by calling the SolrService.all_ids method.
Both lists of ids (new_record_ids and all_ids) are written to log files.
The app compares the list of new_record_ids to the list of all_ids.
Any records in the all_ids list that are not in the new_record_ids list are written to a log file; this is the list of items that should be deleted. These are the records that were indexed into Solr during a previous import but are no longer in the OAI feed.
The SolrService.delete_by_id method is called to remove these items.
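
The steps above amount to a set difference between what is in the index and what the latest import saw. The following is a minimal Python sketch of that logic, not the application's actual code. FakeSolrService, its add method, and run_import are hypothetical stand-ins invented for illustration; only the names remove_deleted_records, new_record_ids, all_ids, and delete_by_id come from the description above. The record ids are made up, and the log-file writes are omitted.

```python
class FakeSolrService:
    """Hypothetical in-memory stand-in for the application's SolrService."""

    def __init__(self, ids):
        self._ids = set(ids)

    def add(self, record_id):
        # Assumption: indexing a record is reduced to remembering its id.
        self._ids.add(record_id)

    def all_ids(self):
        return sorted(self._ids)

    def delete_by_id(self, record_id):
        self._ids.discard(record_id)


def remove_deleted_records(solr, new_record_ids):
    """Delete every indexed record that the latest import did not see."""
    all_ids = solr.all_ids()
    seen = set(new_record_ids)
    # Records indexed by an earlier import but absent from this one.
    to_delete = [rid for rid in all_ids if rid not in seen]
    for rid in to_delete:
        solr.delete_by_id(rid)
    return to_delete


def run_import(solr, feed_ids, delete=False):
    """Index every id from the feed; optionally remove records it lacks."""
    new_record_ids = []
    for rid in feed_ids:
        solr.add(rid)
        new_record_ids.append(rid)
    if delete:
        return remove_deleted_records(solr, new_record_ids)
    return []


solr = FakeSolrService(["rec-1", "rec-2", "rec-3"])
deleted = run_import(solr, ["rec-1", "rec-3", "rec-4"], delete=True)
print(deleted)         # ['rec-2']
print(solr.all_ids())  # ['rec-1', 'rec-3', 'rec-4']
```

Note that in this sketch the comparison is by id only: a record whose id is still in the feed is never selected for deletion, whatever happened to its metadata.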

labradford commented 1 year ago

Lea Ann Bradford April 2022

Slack thread referenced above:

The disappearing interviews are back. We were seeing a difference in total number of assets between our test and pre-production environments. I ran the import by hand, and our test environment now reports four total assets. https://oralhistory-test.library.ucla.edu/ (4) vs https://oralhistory-preprod.library.ucla.edu/ (1,149 + 400). We saw this yesterday, but I thought it was a fluke and simply re-ran the import.

9:35: The four is interesting, as the mismatch was 1149 / 396, or short four assets with media.

Anyone have thoughts as to what the problem could be?

Crystal Richardson 1 year ago Rob thinks that this still relates to the deleting of jobs and the lack of reliability of the OAI feed. He would like to push out a change where, instead of the delete happening automatically, it would only happen when it was manually triggered, or at least on a less frequent schedule.

Crystal Richardson 1 year ago If you think this is a good solution he can push out that code this evening.

John H. Robinson, IV 1 year ago Yes - This is a good solution.

John H. Robinson, IV 1 year ago If this helps troubleshooting - I looked at the to_delete.json and there were four records in it. One of them is 21198-zz002hxhng. I can see it on our pre-prod and the notch8 staging, but it’s a 404 on -test:
https://web.oh.staging.notch8network.com/catalog/21198-zz002hxhng
https://oralhistory-preprod.library.ucla.edu/catalog/21198-zz002hxhng
https://oralhistory-test.library.ucla.edu/catalog/21198-zz002hxhng

T-Kay Sangwand 1 year ago hi @crystal, @Parinita Mulak has a question about the manual delete solution. can someone provide more detail?

Parinita Mulak 1 year ago Hi @crystal what will trigger the manual delete?

Parinita Mulak 1 year ago Also what is the issue with the OAI-feed? is our server not available?

Crystal Richardson 1 year ago I am going to take a shot at explaining this, but I want to make it clear that the final implementation may be slightly different. The OAI feed has had some interruptions while sending data (this is the way it was explained to me in layman's terms), so some of the records do not get sent, and the system interprets that as meaning they have been removed and need to be deleted. If you know that records have been removed, you can run a delete function that will do what the importer does now. It will probably exist on the admin dashboard as either an "import and delete" button or a "delete records removed from OAI" button, or it will be something run on the command line. If the OAI feed were to accidentally skip a couple of records during this phase, some extra records would be deleted, but the next time you ran the importer it would bring all of them back in. The importer should be set to run nightly, so records will be missing for no longer than a day. The importer itself will no longer delete records that have been removed, so if the feed gets interrupted and misses a couple of records it will no longer remove them, and you will not see the number discrepancy that John has been finding. @rob bringing this to your attention so that if I explained this terribly or the plan is different, you can weigh in.

Parinita Mulak 1 year ago Thank you @crystal for the details, I do have a better picture of the solution now. With the deletion feature separated from the importer logic, the importer will only update existing records and add new records. But the manual deletion will have the same issues due to the unreliability of the OAI feed. When you get a chance, can you forward the errors with the OAI feed?

T-Kay Sangwand 1 year ago Thanks for the explanation, @crystal! Will someone ping the channel when the code is updated so we know? @Parinita Mulak, does this solution sound reasonable to you?

John H. Robinson, IV 1 year ago Notch8 said this evening. I expect that to be around 8pm. Rob works late, even on Fridays.

Rob Kaufman 1 year ago c571990b is deployed to the N8 staging env. seems to be working correctly

John H. Robinson, IV 1 year ago Update pushed to -test; re-ran the import; all assets appeared (1149+401=1550); pushed to -preprod; the dropdown retracting when the text is activated is also fixed (ty!)
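
The failure mode discussed in this thread, where an interrupted OAI feed is read as a set of removals, can be reproduced with a small Python sketch. It is an illustration under stated assumptions, not the deployed code: import_records, the delete flag, and the rec-N ids are all hypothetical, and the search index is reduced to a plain set of ids.

```python
def import_records(index, feed_ids, delete):
    """Add every id from the feed to the index (a set of ids).

    When delete is True, also drop ids the feed did not send,
    which is the behaviour that made interviews disappear.
    Returns the set of ids that were removed.
    """
    new_record_ids = set(feed_ids)
    index.update(new_record_ids)
    if not delete:
        return set()
    stale = index - new_record_ids
    index.difference_update(stale)
    return stale


full_feed = [f"rec-{n}" for n in range(10)]
interrupted_feed = full_feed[:6]  # the feed stops early; 4 records never arrive

# Delete on every import: the missing records are treated as removed.
index = set(full_feed)
removed = import_records(index, interrupted_feed, delete=True)
print(len(removed), len(index))  # 4 6

# The next complete import brings them back.
import_records(index, full_feed, delete=True)
print(len(index))  # 10

# Delete decoupled from the import: an interrupted feed removes nothing.
index_no_delete = set(full_feed)
import_records(index_no_delete, interrupted_feed, delete=False)
print(len(index_no_delete))  # 10
```

As noted in the thread, a manually triggered delete still trusts whatever the feed returned on that run, so it carries the same risk; it only happens less often and at a time of the operator's choosing.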