yalelibrary / YUL-DC

Preliminary issue tracking for Yale University Libraries Digital Collections project

Remove case sensitivity from Preservica checksum validation #2779

Closed sshetenhelm closed 1 month ago

sshetenhelm commented 3 months ago

Story

Depending on the packaging tool, checksums created for objects intended for DCS may be in either lowercase or uppercase. However, our current checksum validation is case-sensitive, which results in errors for objects created with specific packaging tools (e.g., the PUT tool).

We should update checksum validation to be case-insensitive.
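A minimal sketch of what case-insensitive validation could look like, written in Python purely for illustration. The function name `checksums_match` and the default of SHA-512 are assumptions, not taken from the management application or from Preservica; the only point is that both hex strings are normalized to one case before they are compared.

```python
import hashlib


def checksums_match(expected_hex: str, data: bytes, algorithm: str = "sha512") -> bool:
    """Compare a supplied hex checksum with the digest of `data`, ignoring case.

    `algorithm` is a hypothetical default; use whatever the packaging tool produced.
    """
    actual_hex = hashlib.new(algorithm, data).hexdigest()
    # Hex digits A-F and a-f encode the same value, so normalize both sides.
    return expected_hex.strip().lower() == actual_hex.lower()
```

With this shape, an uppercase checksum from one packaging tool and a lowercase checksum from another both validate against the same file contents.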

Acceptance

K8Sewell commented 3 months ago

PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1345

sshetenhelm commented 2 months ago

Preservica upload is not working in UAT. Unsure if it is due to this bug fix or the Ruby upgrade.

When attempting to upload, I received the following error: "Child Object 901622792 failed to convert PTIFF due to Expected file originals/92/90/16/22/79/901622792.tif not found." So I guess the child image did not successfully copy over from Preservica? (Video)

Will also comment in Ruby ticket.

sshetenhelm commented 2 months ago

LIKELY related to upgrade, because the second time I tried this ingest, it was with an object that had ingested successfully prior to this.

K8Sewell commented 2 months ago

Picking this back up. Being unable to find the TIFF in the originals folder is potentially due to a failure in copying from Preservica, but it could also be a mismatch in which S3 bucket it's looking in for that TIFF, or something else entirely. Investigating the cause and will hopefully have a fix soon.

K8Sewell commented 2 months ago

After adding an ACCESS_MASTER_MOUNT default variable to the parameter store, a resync of the 901622791 parent object completed successfully. I'm testing a larger sample to evaluate whether that was the needed fix.

Image

K8Sewell commented 2 months ago

PR ready for review - https://github.com/yalelibrary/yul-dc-camerata/pull/378

Looks like the larger samples were successful as well. The above PR will make a change to a few of the templates by updating '/data' to 's3' for the access master mount.

Image

K8Sewell commented 2 months ago

Closed camerata PR no. 378 after discussion with Mike and Martin. The discussion clarified that the PR was not needed: storage should be set to /data for deployed environments. Mike deleted the ACCESS_MASTER_MOUNT env var from the AWS parameter store and I redeployed to UAT. Unfortunately, while most resync with Preservica jobs succeeded, I am still seeing that the original is not being found for some images. I'll need to pair with someone with access to the shares at Yale to debug the original not found issue further. Being unable to find the original file does seem to be a different issue from the case sensitivity issue in the checksum comparison, so perhaps this is ready to be tested again.

Image

sshetenhelm commented 2 months ago

Tried again and got the 'child image not found' failure - https://collections-uat.library.yale.edu/management/batch_processes/1880/parent_objects/901623823

I used ILS as the source this time to see if it was an Aspace issue or something, but still received the failure. For some reason the child is getting created and gets the label/filename from Preservica, but DCS can't pull in the image itself.

K8Sewell commented 2 months ago

PRs ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1357 && https://github.com/yalelibrary/yul-dc-management/pull/1358

Maggie, Martin, and I paired after standup and, unfortunately, everything worked perfectly this time with the last parent object (901623823) that had the original file missing issue, so that's a little tough to debug. I ran another sample of Preservica resync jobs on UAT. Only one child failed to generate a PTIFF due to the file missing issue, but when I reran the sync job it worked successfully (screenshot below). As a next step, I've prepped a PR that will give us additional insight into what is going on with the sync from Preservica process and why it works intermittently. During mobbing, Martin noticed the SHA checksum was saved as nil on a recently synced child object, so the second PR addresses this issue.

Image
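To illustrate why a checksum stored as nil matters for validation, here is a small Python sketch that reports a missing stored checksum as its own state instead of treating it as a mismatch or raising. The function name and the three status strings are hypothetical and are not taken from either PR.

```python
from typing import Optional


def checksum_status(stored: Optional[str], computed: str) -> str:
    """Classify a stored checksum against a freshly computed one.

    Returns "missing", "match", or "mismatch" (hypothetical labels).
    """
    # A checksum that was never saved (None or blank) cannot be validated at all.
    if stored is None or not stored.strip():
        return "missing"
    same = stored.strip().lower() == computed.strip().lower()
    return "match" if same else "mismatch"
```

Separating "missing" from "mismatch" makes it easier to tell a sync that never recorded the checksum apart from a file that was actually corrupted.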

K8Sewell commented 2 months ago

Deployed to UAT with release v2.68.0

sshetenhelm commented 2 months ago

  1. Uploaded an object using the Preservica PUT tool.
  2. Used batch process template to create parent using Preservica information.
  3. Batch process failed with “couldn’t find child object” message.
  4. Additionally, the following fields from my ‘Create Parent Object’ batch process template were not retained:
  5. Pressed “Resync metadata” button.
  6. Refreshed page. Child objects successfully created.

We need to ensure the child objects are being successfully created without (a) having to hit the resync button, (b) displaying a batch process failure message to the user, and (c) overwriting/ignoring any metadata input into the batch process spreadsheet.

In this 3-minute video, you can see me filling out the spreadsheet, copying the Ref ID from Preservica, uploading, viewing the batch process status page and error message, resyncing the parent object, and looking for my missing metadata (which didn’t upload at all, but I didn’t notice until after the resync). I suggest watching at 1.5x or 2x speed and pausing if you need to review something more closely.

The good news is the images DID come in!

EDIT: Link to Batch Process 1896; Link to Parent Object 901623850

K8Sewell commented 2 months ago

PRs ready for review -

https://github.com/yalelibrary/yul-dc-management/pull/1359

https://github.com/yalelibrary/yul-dc-management/pull/1360 (tests passed last run but timed out in CI so should be green shortly)

K8Sewell commented 2 months ago

PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1361

K8Sewell commented 2 months ago

Deployed to UAT with release v2.68.1

K8Sewell commented 2 months ago

Still seeing the missing original issue - taking back to in progress

Image

K8Sewell commented 2 months ago

PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1362

K8Sewell commented 2 months ago

Deployed to UAT with release v2.68.2

K8Sewell commented 2 months ago

The job hits an error caused by a typo in a method, but the additional logging proves that the process is reaching this section.

Image

K8Sewell commented 2 months ago

PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1363

^ fixes typo and adds wip spec

sshetenhelm commented 2 months ago

Tested with UAT batch processes 1931 and 1932.

Both batch processes remain in a state of "in progress" and don't pull in TIFFs with the 'Create' batch process. However, both successfully pulled in TIFF images after either (a) resyncing the metadata using the "Update this item" button on the parent object page, or (b) "selecting" a new thumbnail (even though the images didn't exist yet) and saving, which caused the object to update and pull in the images.

In a perfect world, we wouldn't need the additional resyncing step.

K8Sewell commented 2 months ago

Thanks for testing, Summer & JP. Based on the testing results, I have removed the retries (from prior PRs), which seemed to be creating a situation where the sync from Preservica job would hang as incomplete. An extra step was still needed to successfully pull the original files from Preservica over to the shares at Yale, so I have added a redo when the file is actually written to the access master path.

PR ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1364/
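One reading of the change described in this comment is "only run the follow-up step once the original is confirmed to be on disk". The Python sketch below illustrates that idea only; it is not the code in the PR, and `convert_once_written`, `copy_original`, and `convert` are hypothetical names.

```python
from pathlib import Path
from typing import Callable


def convert_once_written(
    copy_original: Callable[[Path], None],
    convert: Callable[[Path], None],
    target: Path,
) -> bool:
    """Copy the original to `target`, then convert only if it really landed there."""
    copy_original(target)
    # Treat a missing or zero-byte file as "not written" and skip conversion.
    written = target.is_file() and target.stat().st_size > 0
    if written:
        convert(target)
    return written
```

A caller could log the `False` case and re-enqueue the copy, rather than letting the conversion fail with an "Expected file ... not found" error.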

K8Sewell commented 2 months ago

Deployed to UAT with release v2.68.4

K8Sewell commented 2 months ago

Taking back to in progress to resolve the persisting missing originals file issue. The /data/ path access appears to be resolved, but I still need to fix the /originals/ path. Have begun to schedule mobbing time to help with debugging.

Image

K8Sewell commented 1 month ago

PR to add more descriptive logging ready for review - https://github.com/yalelibrary/yul-dc-management/pull/1369

K8Sewell commented 1 month ago

PR that will fix the GoodJob errors we were seeing in the worker logs on Test and UAT - https://github.com/yalelibrary/yul-dc-management/pull/1372/ (it was deployed on Test and can be redeployed there if desired)

Currently investigating two things:

Image

Image

https://us-east-1.console.aws.amazon.com/ecs/v2/clusters/yul-dc-test/services/yul-dc-test-worker/tasks/c3139a9790a543ce9ce133c84792ed4b/logs?region=us-east-1