scientist-softserv / adventist_knapsack

Apache License 2.0

UV splitting skipping and jumbling pages #674

Open KatharineV opened 2 weeks ago

KatharineV commented 2 weeks ago

We just discovered UV/splitting-related errors in some works that were uploaded through SpaceStone and the periodical OAI set migration. I'm making a ticket now, but my hope is that we can simply resplit these large PDFs after we upgrade to Valkyrie. If Valkyrie handles large, complex PDFs better than Fedora does, the UV may render the works properly.

Problem: Large PDFs have not split correctly. Pages in the UV appear out of order, and not every page has been split into a child work.

Examples: https://adl.b2.adventistdigitallibrary.org/concern/published_works/20220366_seventh_day_adventist_yearbook_january_1_1952 This PDF has 504 pages, but only 423 child works appear. The viewer starts with page 1, then skips to page 10, then page 100, then 101, and the page order remains incorrect throughout the document. The PDF attached to the work downloads fine and is accurate, so the problem appears to lie in the splitting and child-work creation process rather than in the source file.

https://adl.b2.adventistdigitallibrary.org/concern/published_works/20220414_seventh_day_adventist_yearbook_january_1_2008 This PDF has 790 pages, but only 44 child works appear. The pages that do render are out of order in the viewer: it starts with page 1 but then skips to page 10. As with the work above, the attached PDF appears fine when you download it.
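
A note on the ordering: the sequence 1, 10, 100, 101 is what you get when page numbers are sorted as text rather than as numbers. This is only a guess at the cause, since nothing in this ticket confirms how the child works are ordered. The Python sketch below just illustrates the pattern; the `page-N` labels and the `natural_key` helper are hypothetical and are not taken from the application.

```python
import re


def natural_key(name: str):
    """Split a label into text and integer chunks so digit runs compare numerically."""
    return [int(part) if part.isdigit() else part for part in re.split(r"(\d+)", name)]


# Hypothetical child-work labels for a 504-page PDF; the real naming scheme
# is an assumption and is not shown in this issue.
pages = [f"page-{n}" for n in range(1, 505)]

# Plain string sort: compares character by character, so "page-10" < "page-2".
lexicographic = sorted(pages)

# Natural sort: compares the numeric part as an integer.
natural = sorted(pages, key=natural_key)

print(lexicographic[:4])  # ['page-1', 'page-10', 'page-100', 'page-101']
print(natural[:4])        # ['page-1', 'page-2', 'page-3', 'page-4']
```

If the jumbled order in the viewer matches the first list, the ordering problem may be separate from the missing child works (423 of 504 and 44 of 790), which a sort fix alone would not explain.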