scientist-softserv / adventist_knapsack

Apache License 2.0

🦄 Spike: Processing previously loaded content #427

Open ShanaLMoore opened 1 year ago

ShanaLMoore commented 1 year ago

Updated Ticket 2023-04-18

Story

We are trying to answer: how can we ensure that previously loaded content in the digital library gets processed by the viewer, considering the SDAPI ingest and Derivative Rodeo work?

3. [This work](https://adl.s2.adventistdigitallibrary.org/concern/published_works/20215986_the_great_advent_movement?locale=en) was originally uploaded on 1-30-23; the PDF did not reupload with the import, so the viewer didn't process the book. The TXT and thumbnail did reupload on 3/18. How can we ensure that previously loaded content in the digital library gets processed by the viewer? ![image](https://user-images.githubusercontent.com/10081604/226653501-67b541d3-b434-4ba6-9556-85b88fdcef1c.png)

Acceptance Criteria


Old ticket

### Story

Per Katharine, "I see some discrepancies in child works. Can you look at these examples and see if the issues are due to an error or due to the works still processing?"

- [ ] 1. UV not rendering
- [ ] 2. UV doesn't display human readable titles in the content panel
- [ ] 3. Spike: How can we ensure that previously loaded content in the digital library gets processed by the viewer? Considering SDAPI ingest and derivative Rodeo work

### Acceptance Criteria

1. child works should render in the UV
2. UV should display human readable title in the content panel
3. TBD

### Screenshots / Video

1. [This work](https://adl.s2.adventistdigitallibrary.org/concern/published_works/20213922_evidence_from_scripture_and_history_of_the_second_coming_of_christ?locale=en) has an upload date of 3/14; child works show in the Items list, but nothing loads in the viewer. The child works show duplicates of pages such as page 002 and 011. ![image](https://user-images.githubusercontent.com/10081604/226654448-95d88b80-a491-4716-87a8-5624933108ae.png) ![image](https://user-images.githubusercontent.com/10081604/226652052-a48a5ec5-dd26-4f73-978e-bddc6322857a.png)
2. [This work](https://adl.s2.adventistdigitallibrary.org/concern/published_works/20217998_christian_stewardship?locale=en) was uploaded 3/20. The viewer shows the thumbnail and the 15-page PDF, but each page in the viewer has a string of numbers and letters as its "title" rather than the human-readable name I expect to see. Will this resolve? Or can we fix the titles to show as they do in the Items list? ![image](https://user-images.githubusercontent.com/10081604/226653288-465e2c86-e906-4374-bc38-9d5c8a53662a.png)
3. [This work](https://adl.s2.adventistdigitallibrary.org/concern/published_works/20215986_the_great_advent_movement?locale=en) was originally uploaded on 1-30-23; the PDF did not reupload with the import, so the viewer didn't process the book. The TXT and thumbnail did reupload on 3/18. How can we ensure that previously loaded content in the digital library gets processed by the viewer? ![image](https://user-images.githubusercontent.com/10081604/226653501-67b541d3-b434-4ba6-9556-85b88fdcef1c.png)

### Testing Instructions and Sample Files

- TBD

### Notes

Consider breaking these into separate tickets.

jillpe commented 1 year ago

Focus on criterion 2 for this ticket

jillpe commented 1 year ago

Points 1 and 2 have been resolved. Point 3 can be deferred until after SDAPI ingest.

ShanaLMoore commented 1 year ago

Noting that we discussed the possibility of pushing this to the next phase of work or maintenance contract. This work doesn't relate to SDAPI and the deadline for ADL ingest isn't until November. cc @jillpe

jillpe commented 11 months ago

possibly helpful: https://assaydepot.slack.com/archives/C0313NJV9PE/p1689611025243569?thread_ts=1688740330.826599&cid=C0313NJV9PE

jillpe commented 11 months ago

related to

Possible to merge into one ticket?

jillpe commented 8 months ago

Potentially related to https://assaydepot.slack.com/archives/C0313NJV9PE/p1698254306742289

KatharineV commented 8 months ago

Team, today I noticed that some works have dropped out of the OAI feed for the adl:issue set. The works were present almost two years ago when my predecessor ran the first Bulkrax OAI importers. As a result, the periodical issue works have ingested and are present in the digital library repository, and the associated files are attached to the metadata, but the files (of course) don't render in the UV because they were imported looooong before the print gem.

I want you to be aware that these works have dropped from the OAI feed so that whatever solution is chosen for reindexing and/or rendering in the viewer does not require erasing and rewriting the works, which would be impossible now that the feed has changed. Our team is unable to repair the feed to reinstate the works, and sadly I'm also unable to identify why they were present in the feed and are now gone.

So that you can understand what I'm saying, here are two examples.

Works are present on ADL prod:

  1. https://adl.b2.adventistdigitallibrary.org/concern/published_works/22252037_western_midnight_cry?locale=en
  2. https://adl.b2.adventistdigitallibrary.org/concern/published_works/22252165_vermont_telegraph?locale=en

Empty OAI_ADL query for each work's ID:

  1. http://oai.adventistdigitallibrary.org/OAI-script?verb=GetRecord&metadataPrefix=oai_adl&identifier=22252037
  2. http://oai.adventistdigitallibrary.org/OAI-script?verb=GetRecord&metadataPrefix=oai_adl&identifier=22252165
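
For anyone who wants to check more IDs than the two above, here is a minimal Python sketch of the same `GetRecord` lookup. The base URL and the `oai_adl` prefix come from the example queries; the function names are hypothetical, and the parsing assumes the feed answers with standard OAI-PMH 2.0 XML (namespace `http://www.openarchives.org/OAI/2.0/`), which has not been confirmed for this feed. An empty body, unparseable XML, or an `<error>` element are all treated as "record not in the feed".

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base URL copied from the example queries in this comment.
BASE_URL = "http://oai.adventistdigitallibrary.org/OAI-script"
# Assumption: the feed uses the standard OAI-PMH 2.0 namespace.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def get_record_url(identifier, metadata_prefix="oai_adl", base_url=BASE_URL):
    """Build a GetRecord URL shaped like the example queries."""
    query = urlencode({
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    })
    return f"{base_url}?{query}"


def record_present(xml_text):
    """Return True only if the response contains a GetRecord/record element."""
    if not xml_text.strip():
        return False  # empty response body
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False  # not well-formed XML
    if root.find(f"{OAI_NS}error") is not None:
        return False  # OAI-PMH error, e.g. code="idDoesNotExist"
    return root.find(f"{OAI_NS}GetRecord/{OAI_NS}record") is not None


if __name__ == "__main__":
    # IDs taken from the two example works; this only prints the URLs.
    for work_id in ["22252037", "22252165"]:
        print(get_record_url(work_id))
```

To run it against the live feed, fetch each URL (for example with `urllib.request.urlopen(url).read().decode()`) and pass the body to `record_present`. Works that return `False` are the ones any reprocessing approach cannot re-import from OAI.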