ucldc / rikolti

calisphere harvester 2.0

[data provider issue] UCR harvesting issues - contacted UCR #944

Closed by christinklez 6 months ago

christinklez commented 6 months ago

Request to recreate objects (or store them in "do not publish" for the moment)

Registry ID: 85

Updated on May 22: New object here https://nuxeo.cdlib.org/nuxeo/nxdoc/default/7cb9e861-588c-48dc-8ca7-74d1bfcb4db2/view_documents

Updated on May 24: They went with the second option: keep the text doc and replace the main file with a PDF. (TIF included as a component object.)

Note: If they are unable to review these at the moment, let's ask if they can create “do not publish” and “publish” folders within this collection's folder, and move the objects to their corresponding folders. Please let us know, and we will reharvest from the updated folder.

Project folder clean-up

Registry ID: 27414

Update on May 22: They created "published" & "unpublished" folders; updated harvest extra data with the new Nuxeo path.

ARK identifier conflicts

Registry ID: 26531 & 26750 (two records in two different collections share an ARK)

Update on May 24: The records with ARK conflicts have been moved to "unpublished" folders; UCR will continue to investigate these issues. In the meantime:

Registry ID: 26531 & 26758 (two records in two different collections share an ARK)

Update on May 24: The records with ARK conflicts have been moved to "unpublished" folders; UCR will continue to investigate these issues. In the meantime:

Registry ID: 26531 (two records within the same collection share an ARK)

Update on May 24: The records with ARK conflicts have been moved to "unpublished" folders; UCR will continue to investigate these issues. In the meantime:

Thumbnail cropping (just an fyi)

We're getting oddly cropped thumbnails. It may be that the first page is a different size from the other pages in the PDF.

Registry ID: 27470 - FYI (no action requested)

Registry ID: 26758 - FYI (no action requested)

Update on May 24: Zoomed with Krystal and Natalie and showed them the weirdly cropped thumbnails. They are okay with this for now, but perhaps we can look at the thumbnailer again post-mvp.

christinklez commented 6 months ago

For project folders & records cleanup: https://help.oac.cdlib.org/a/tickets/138406

For ARK identifiers: https://help.oac.cdlib.org/a/tickets/138407

Did not message about the cropping.

christinklez commented 6 months ago

UCR's notes on their ARK conflict investigation: https://docs.google.com/document/d/1x9_ocEtEQBmXtbKpWBiJFgruF5JbNN2GgyKhTinaiU4/edit#heading=h.6hpd3zl8q867

Discussed on 5/24:

christinklez commented 6 months ago

Collections 85 & 27414: Both collections have been reharvested. They are done. UCR fixed the Nuxeo doc type/file format issues and did some project folder cleanup.

Collections 26750, 26531, & 26758: All three collections have been reharvested. UCR has moved records with ARK conflicts into a separate "unpublished" folder so they can investigate the issues. They will request reharvesting of these collections once the ARK conflict issues have been resolved.
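For reference, the kind of check that surfaces these conflicts can be sketched as below. This is only a minimal illustration, not Rikolti's actual code: the record shape (`collection_id`, `record_id`, `ark`) and the function name `find_ark_conflicts` are assumptions, and the ARK values are made-up placeholders. It covers both cases noted above (the same ARK across two collections, and the same ARK twice within one collection).

```python
from collections import defaultdict


def find_ark_conflicts(records):
    """Return {ark: [(collection_id, record_id), ...]} for ARKs used by more than one record.

    `records` is an iterable of dicts with hypothetical keys
    "collection_id", "record_id", and "ark" (not the real harvester schema).
    """
    by_ark = defaultdict(list)
    for record in records:
        by_ark[record["ark"]].append((record["collection_id"], record["record_id"]))
    # Keep only ARKs claimed by two or more records.
    return {ark: owners for ark, owners in by_ark.items() if len(owners) > 1}


# Placeholder data for illustration only; these are not real UCR records or ARKs.
records = [
    {"collection_id": 26531, "record_id": "a", "ark": "ark:/00000/example1"},
    {"collection_id": 26750, "record_id": "b", "ark": "ark:/00000/example1"},
    {"collection_id": 26531, "record_id": "c", "ark": "ark:/00000/example2"},
    {"collection_id": 26531, "record_id": "d", "ark": "ark:/00000/example3"},
    {"collection_id": 26531, "record_id": "e", "ark": "ark:/00000/example3"},
]

conflicts = find_ark_conflicts(records)
for ark, owners in conflicts.items():
    print(ark, owners)
```

Running a check like this across all three collections together, rather than one collection at a time, is what would catch the cross-collection duplicates.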

Collections 27470 & 26758: Shared the fyi about the weird thumbnail cropping. This is not an issue for them. However, we should investigate this post-mvp. See #965 (post-mvp board)