Closed sshetenhelm closed 1 month ago
PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1345
Preservica upload is not working in UAT. Unsure if it is due to this bug fix or the Ruby upgrade.
When attempting to upload, I received the following error:
Child Object 901622792 failed to convert PTIFF due to Expected file originals/92/90/16/22/79/901622792.tif not found.
So I guess the child image did not successfully copy over from Preservica?
Video
Will also comment in Ruby ticket.
LIKELY related to upgrade, because the second time I tried this ingest, it was with an object that had ingested successfully prior to this.
Picking this back up. The missing tif in the originals folder could be due to a failed copy from Preservica, a mismatch in which S3 bucket is being searched for that tif, or something else entirely. Investigating the cause and will hopefully have a fix soon.
After adding an ACCESS_MASTER_MOUNT default variable to the parameter store, a resync of the 901622791 parent object completed successfully. I'm testing a larger sample to evaluate whether that was the needed fix.
PR ready for review - https://github.com/yalelibrary/yul-dc-camerata/pull/378
Looks like the larger samples were successful as well. The above PR will make a change to a few of the templates by updating '/data' to 's3' for the access master mount.
Closed camerata PR no. 378 after discussion with Mike and Martin. It was clarified that the PR was not needed: storage should be set to /data for deployed environments. Mike deleted the ACCESS_MASTER_MOUNT env var from the AWS parameter store and I redeployed to UAT. Unfortunately, while most resync with Preservica jobs succeeded, I am still seeing that the original is not being found for some images. I'll need to pair with someone who has access to the shares at Yale to debug the original not found issue further. The missing original file does seem to be a different issue from the case sensitivity issue in the checksum comparison, so perhaps this is ready to be tested again.
Tried again and got the 'child image not found' failure - https://collections-uat.library.yale.edu/management/batch_processes/1880/parent_objects/901623823. I used ils as the source this time to see if it was an Aspace issue or something, but still received the failure. For some reason the child is getting created and gets the label/filename from Preservica, but DCS can't pull in the image itself.
PRs ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1357 && https://github.com/yalelibrary/yul-dc-management/pull/1358
Maggie, Martin, and I paired after standup and unfortunately everything worked perfectly this time with the last parent object (901623823) that had the original file missing issue, so that's a little tough to debug. I ran another sample of Preservica resync jobs on UAT. Only one child failed to generate a ptiff due to the file missing issue, and when I reran the sync job it worked successfully - screenshot below. As a next step, I've prepped a PR that will give us additional insight into what is going on with the sync from Preservica process and why it works intermittently. During mobbing, Martin noticed the sha checksum was saved as nil on a recently synced child object, so the second PR addresses this issue.
We need to ensure the child objects are being successfully created without (a) having to hit the resync button, (b) displaying a batch process failure message to the user, and (c) overwriting/ignoring any metadata input into the batch process spreadsheet.
In this 3 minute video, you can see me filling out the spreadsheet, copying the Ref ID from Preservica, uploading, viewing the batch process status page and error message, resyncing the parent object, and looking for my missing metadata (which didn’t upload at all but I didn’t notice until after the resync). I suggest watching on 1.5x or 2x speed and pausing if you need to review something closer.
The good news is the images DID come in!
EDIT: Link to Batch Process 1896; Link to Parent Object 901623850
PRs ready for review -
https://github.com/yalelibrary/yul-dc-management/pull/1359
https://github.com/yalelibrary/yul-dc-management/pull/1360 (tests passed last run but timed out in CI so should be green shortly)
PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1361
Still seeing the missing original issue - taking back to in progress
PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1362
This hits an error caused by a typo in a method, but the additional logging proves that the process is reaching this section.
PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1363
^ fixes typo and adds wip spec
Tested with UAT batch processes 1931 and 1932.
Both batch processes remain in a state of "in progress" and don't pull in TIFFs with the 'Create' batch process. However, both successfully pulled in TIFF images after either (a) resyncing the metadata using the "Update this item" button on the parent object page, or (b) "selecting" a new thumbnail (even though the images didn't exist yet) and saving, which caused the object to update and pull in the images.
In a perfect world, we wouldn't need the additional resyncing step.
Thanks for testing, Summer & JP. Based on the testing results, I have removed the retries (from prior PRs), which seemed to be creating a situation where the sync from Preservica job would hang as incomplete. An extra step was also still needed to successfully pull the original files over from Preservica to the shares at Yale, so I have added a redo when the file is actually written to the access master path.
PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1364/
Taking back to in progress to resolve the persisting originals file missing issue. The /data/ path access appears to be resolved, but the /originals/ path still needs to be fixed. I have begun scheduling mobbing time to help with debugging.
PR to add more descriptive logging ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1369
PR that will fix the goodjob errors we were seeing in the worker logs on Test and UAT - https://github.com/yalelibrary/yul-dc-management/pull/1372/ (it was deployed on Test and can be redeployed there if desired)
Currently investigating two things:
Story
Depending on the packaging tool, checksums created for objects intended for DCS may be in either lower- or uppercase. However, our current checksum validation is case-sensitive, which results in errors for objects created with specific packaging tools (e.g., the PUT tool).
We should update checksum validation to be case-insensitive.
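For illustration, here is a minimal sketch of what a case-insensitive comparison could look like in Ruby. This is not the code from the management app: `checksums_match?` is a hypothetical helper name, and SHA-256 is used in the usage lines only as an example digest. It assumes both checksums are hex strings, and it treats a nil checksum as a mismatch rather than raising.

```ruby
require "digest"

# Hypothetical helper (not from yul-dc-management): compares two hex
# checksums without regard to letter case.
def checksums_match?(expected, actual)
  # A missing checksum on either side is never a match.
  return false if expected.nil? || actual.nil?

  # String#casecmp? performs a case-insensitive equality check;
  # strip guards against stray whitespace from a manifest or spreadsheet.
  expected.strip.casecmp?(actual.strip)
end

# Usage: Digest returns lowercase hex, while a packaging tool might
# record the same digest in uppercase.
computed = Digest::SHA256.hexdigest("example file contents")
recorded = computed.upcase

puts checksums_match?(recorded, computed) # prints "true"
puts checksums_match?(nil, computed)      # prints "false"
```

An alternative would be to downcase checksums once when they are saved, so every later comparison can stay a plain equality check. Which approach fits better depends on whether the stored value needs to preserve what the packaging tool wrote.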
Acceptance