Closed by sshetenhelm 5 months ago
Have written to BRBL to ask whether the Papyri data in Preservica Test is complete https://yul-pres-tsdb.library.yale.edu/explorer/explorer.html#browse:SO&ecd02c4b-21a9-4628-80e5-f257ede28f97&null and whether the ASpace handles are now in Test and/or Prod.
David and Mark are working on getting this collection into ASpace TEST and Preservica TEST so we can add it to DCS UAT.
@sshetenhelm did we want to pull this into Sprint at some point, as this work is now ongoing?
Yes, I can go ahead and slide it in.
@ucancallmealicia is the papyri metadata in test Aspace? Or is that still in progress?
Yes it's in test in the BRBL repo: https://testarchivesspace.library.yale.edu/resources/5894
DPS is still doing testing in Preservica re: the collection hierarchy.
TEST resync is complete.
@ucancallmealicia Aspace is saying YPC has an unpublished ancestor. Is that why I can't seem to find it on test Archives at Yale?
@sshetenhelm the resource record was unpublished, so yes that's why you were getting that message. I just published it so should show up in a few.
Papyri object number 1 is in! https://collections-uat.library.yale.edu/catalog/901602829
I'll start working on the others :)
Currently adding Fixity to entire collection in order to ingest in bulk.
~~Using information from ArchivesSpace report, created 6894 parents in MGMT UAT. 23 parents did not pull children -- will investigate~~
PID in Ladybird only has 6462 parents, while FindIt shows 6553 total parents.
@motropuk and @ucancallmealicia, do either of you know a definitive amount of parent objects we should have?
For objects that had zero children in MGMT, folder in Preservica appeared to have no images. Still investigating.
Based on spreadsheet papyri-2018-excel-master, was able to match and remove all parents attached to records of objects without scans.
Now have 6319 parents in Management, which is still a discrepancy with Ladybird PID 42 (6462 - 6319 = 143). Possible some LB records hierarchical? Will continue investigating.
141 objects are in FindIt but do not have TIFFs, as per legacy spreadsheet. These objects are: YPC_NoTIFFs.csv
11 objects are in FindIt but did not bring children into DCS, but should have TIFFs, as per legacy spreadsheet. These objects are: YPC_MissingScans.csv
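The split into the two lists above can be sketched as a lookup against the legacy spreadsheet. This is only an illustration, not the script actually used: the identifiers, the `legacy_has_tiffs` mapping, and the function name are all hypothetical.

```python
def classify_childless(childless_parents, legacy_has_tiffs):
    """Split parents that pulled zero children into two buckets.

    childless_parents: identifiers of parents with no children in DCS.
    legacy_has_tiffs: dict of identifier -> True if the legacy spreadsheet
        says scans exist (hypothetical representation of the spreadsheet).
    """
    no_tiffs, missing_scans = [], []
    for identifier in childless_parents:
        # Identifiers absent from the spreadsheet are treated as "no scans"
        # here; a real check would probably flag them separately.
        if legacy_has_tiffs.get(identifier, False):
            missing_scans.append(identifier)  # scans expected but not ingested
        else:
            no_tiffs.append(identifier)       # nothing to ingest
    return no_tiffs, missing_scans


# Made-up identifiers, for illustration only.
legacy = {"obj-a": False, "obj-b": True, "obj-c": False}
no_tiffs, missing_scans = classify_childless(["obj-a", "obj-b", "obj-c"], legacy)
print(no_tiffs)       # ['obj-a', 'obj-c']
print(missing_scans)  # ['obj-b']
```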
Ingested 9 of 11 outstanding objects, thanks David for helping us locate!
In Management - 6,328 parents
In Blacklight - 6,328 parents
Still investigating P.CtYBR inv. 4717(B) & P.CtYBR inv. 3679(B)
6914 records on Mark Custer’s spreadsheet
587 should have zero images
6914 - 587 = 6327
6328 in MGMT + 2 outstanding = 6330
Not sure how we are 3 over, but I will investigate.
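For reference, the count check above written out as a small helper. This is a sketch only; the function and parameter names are made up, and the numbers are the ones quoted in this comment.

```python
def parent_gap(legacy_records, zero_image_records, in_mgmt, outstanding):
    """Compare the expected parent count with what is accounted for."""
    expected = legacy_records - zero_image_records  # parents that should have images
    accounted = in_mgmt + outstanding               # ingested + still being chased
    return expected, accounted, accounted - expected


expected, accounted, gap = parent_gap(6914, 587, 6328, 2)
print(expected, accounted, gap)  # 6327 6330 3
```

A positive gap means more parents are accounted for than the legacy list expects; a negative gap means some are missing.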
Identified 5 duplicates of parents in UAT (same aspace URI, same Preservica info). Deleted. Now 6,323 parents in MGMT UAT.
6,323 + 2 still under investigation = 6,325
Now we are two objects under legacy list
Will continue to investigate
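A duplicate check like the one described above (same ASpace URI, same Preservica info) could look roughly like this. It is a sketch under assumptions: the field names `oid`, `aspace_uri`, and `preservica_uri` are hypothetical and the sample rows are invented.

```python
from collections import defaultdict


def find_duplicate_parents(parents):
    """Return groups of OIDs that share the same (aspace_uri, preservica_uri)."""
    groups = defaultdict(list)
    for parent in parents:
        key = (parent["aspace_uri"], parent["preservica_uri"])
        groups[key].append(parent["oid"])
    # Keep only keys that more than one parent points at.
    return {key: oids for key, oids in groups.items() if len(oids) > 1}


# Invented rows, for illustration only.
parents = [
    {"oid": 1, "aspace_uri": "/a/10", "preservica_uri": "so-a"},
    {"oid": 2, "aspace_uri": "/a/11", "preservica_uri": "so-b"},
    {"oid": 3, "aspace_uri": "/a/10", "preservica_uri": "so-a"},
]
dupes = find_duplicate_parents(parents)
print(dupes)  # {('/a/10', 'so-a'): [1, 3]}
```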
@ucancallmealicia would it be possible to move Papyri records into PROD aspace, if this has not been done already?
Update from DPS - the Aspace link workflow is working; however, items are not moving automatically to the BRBL Preservica folder. Putting this down temporarily while people are out of office, will pick back up the week of the 19th.
Pivoting back to this. Records are being published to Aspace as we speak, will attempt first ingest this afternoon.
Currently 3,367 available in Blacklight PROD.
Current stats:
5070 in Blacklight
5646 in MGMT
Still working
Up to 5922 in Blacklight, still working through the last few hundred
6,087 YPC parents in Blacklight
6,716 YPC parents in MGMT
625 have 0 children
6716-625 = 6091, which means 4 parents have children but are not displaying in Blacklight
Will continue to investigate.
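One way to pin down which parents those are: take the MGMT parents that have at least one child and subtract the set already visible in Blacklight. A minimal sketch, assuming both lists can be exported; the function name and the sample OIDs are made up.

```python
def missing_from_blacklight(mgmt_child_counts, blacklight_oids):
    """Parents with children in MGMT that do not appear in Blacklight.

    mgmt_child_counts: dict of parent OID -> number of children in MGMT.
    blacklight_oids: iterable of parent OIDs currently in Blacklight.
    """
    with_children = {oid for oid, count in mgmt_child_counts.items() if count > 0}
    return sorted(with_children - set(blacklight_oids))


# Invented OIDs, for illustration only.
mgmt = {101: 2, 102: 0, 103: 1, 104: 4}
missing = missing_from_blacklight(mgmt, [101, 104])
print(missing)  # [103]
```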
Goal is 6,323 in Blacklight
Currently 6,136
Difference of 187 objects
Still investigating.
Currently 6144 objects in Blacklight. Identified 104 objects that are in Preservica/Aspace TEST but not in Preservica/Aspace PROD. Working on reconciling.
6,253 in Blacklight
Batch Process 16838 has the following failures -
36 objects:
31 objects:
Request error 503 <?xml version="1.0" encoding="UTF-8" standalone="yes"?><Error><ExtendedMessage>Content does not exist in the storage.</ExtendedMessage><MessageKey>error.from.storage</MessageKey></Error>
CSV HERE: YPC-Failed-Broken.csv
3 objects:
For the first three objects I tried from YPC-Failed-Broken, right-click > Download returned the following error message for at least one TIF in each folder:
This page isn’t working preservica.library.yale.edu is currently unable to handle this request.
HTTP ERROR 500
The three YPC-0Children-Broken objects I tried all worked though.
YPC-Failed-Broken needed a fix in Preservica; our test worked, so just waiting for the green light to ingest the rest of that batch.
For YPC-0Children-Broken, From David:
_so far the first 4-5 that I've checked from YPC-0Children-Broken.csv don't have any download events from the s_dcsbrbl account in their histories, so for those it looks like the request isn't making it to the download level_
@sshetenhelm the items on YPC-Failed-Broken are ready to be re-tried.
The cause of these failures was stale file handles being presented by the storage product to Preservica. We were able to manually refresh these stale file handles. It's not clear yet why these specific files were afflicted; we are working towards a general solution.
YPC-Failed-Broken worked, but had to resync a few to get them going. Now 6,284 in PROD.
Successfully ingested the unpublished Aspace parents. Now 6,288 in Prod
The only outstanding objects are the 36 that refuse to create child objects. Will continue to investigate.
Case Study: Parent OID 33203833
Jobs for object:
Is there any way to look up Management Production jobs 13993956 and 13993957 to see if something went wonky there?
Did one more test today, having same issue. Created #2826 to address remaining 36 objects and will close this ticket. Problem described in previous comment is available in #2824 for troubleshooting purposes.
Story
The Yale Papyrus Collection (PID 42) is currently in FindIt and needs to move to DCS. However, we will use this collection as a test of the new Preservica DCS integration. We will ingest the collection into Preservica Test and DCS UAT first. Then, when Preservica V6 is in Prod, we will ingest the final collection there. BRBL have said that original parent OIDs DON'T need to be retained from FindIt, so this is no longer a migration, but instead a new collection ingest using Preservica.
Acceptance