Closed wickr closed 1 month ago
PDFs fixed, derivatives looking good. Compounds have no errors. The only validation error left is on one item with a Source field, but the data is fine (the verifier doesn't like the line break).
QA
Overall:
Search/indexing/faceting check:
Facets:
Work pages:
Collection pages:
[x] Collection pages are there
[x] Collection description/blurb is there
[x] Facets are there
[x] Restrictions: Double-check restricted collections and works (VPN)
[ ] Bulk Approve (step 11)
[ ] Collection-level QA: Compare number of assets on OD2 with total in original pidlist (step 12)
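The count comparison in step 12 can be sketched roughly as below. This is only an illustration: it assumes each pidlist file holds one pid per line, and `unique_pids` and `counts_match` are hypothetical helper names, not part of any OD2 tooling. The OD2 total would be read off the collection page by hand.

```python
def unique_pids(*pidlist_texts):
    """Collect the distinct pids from one or more pidlist file contents.

    Assumption: one pid per line; blank lines and surrounding whitespace are ignored.
    """
    pids = set()
    for text in pidlist_texts:
        for line in text.splitlines():
            pid = line.strip()
            if pid:
                pids.add(pid)
    return pids


def counts_match(od2_total, *pidlist_texts):
    """Return True when the OD2 asset total equals the number of distinct pids."""
    return od2_total == len(unique_pids(*pidlist_texts))
```

Passing the contents of both the `_nocpds` and `_cpds` lists together avoids double-counting a pid that appears in both.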
There is an issue with the Creator label on the URI http://id.loc.gov/authorities/names/n90718759 (the label should be "Czech Republic" but it is instead showing some backend metadata: "Product of split:--Czech Republic--http://rdaregistry.info/Elements/u/P60685"), affecting show pages on 8 works. I attempted to manually remove and re-add the URI on 9c67wr914, which did not correct the label. I then noticed that the Data Sources tab is showing the label as expected, while the main show page has the error. This is true on the edited work and the non-edited ones.
Also note that on Browse All, six works have the erroneous label appearing in the filter list, but two works have the correct label.
The Geonames term 'Po River' in Water Basin is causing the same split-faceting issue we saw in cities in multiple counties. I reopened 2452.
A couple of weird PDF mime types showing up. Maybe worth reprocessing and replacing the files.
^ @wickr I will take a stab at the PDF replacement. I'm planning to move forward with item level QA otherwise, assuming these aren't migration issues but underlying system things.
@carakey Ok I fixed the Czech Republic label issue. Somehow that 'Product of split' string got included as another prefLabel at one point. I cleared out the blazegraph cache and refetched, and didn't see it come back. I reindexed all 8 works just to be sure and they're showing together now.
The PDF mpeg one is probably the only one I'd worry about for now, unless you want to do the EXIF one too, but that one might have EXIF metadata somewhere, so it's not really wrong.
Yeah I don't think anything else is migration-related.
thanks -- but what's the difference between pdf (PDF/A) (# 2 with 28 works) and pdf (PDF/A, Portable Document Format) (# 5 with 1 work)?
@carakey not sure. I'm sure we can collapse the values in indexing in the future, but so far we haven't messed with what is being extracted in characterization.
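For illustration only, collapsing the values at index time might look something like the sketch below. It assumes the extracted label is a plain string shaped like the two facet values quoted above, and `collapse_format_label` is a hypothetical helper, not something that exists in OD2 or in the characterization step.

```python
import re


def collapse_format_label(label):
    """Reduce a format label to its short name plus its first qualifier.

    Hypothetical rule: 'pdf (PDF/A, Portable Document Format)' becomes 'pdf (PDF/A)'.
    Labels that don't look like 'name (qualifiers)' are returned stripped but unchanged.
    """
    match = re.fullmatch(r"\s*(\S+)\s*\(([^)]*)\)\s*", label)
    if match is None:
        return label.strip()
    short_name, qualifiers = match.groups()
    first = qualifiers.split(",")[0].strip()
    return f"{short_name} ({first})" if first else short_name
```

Under that rule both PDF values above would land in the same facet bucket; whether dropping the later qualifiers is acceptable is a metadata decision, not something the sketch settles.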
The thumbnails for the 3 items whose PDFs I replaced are not appearing on Browse All / search results. The larger thumbnail is showing up ok on the (logged-in admin) work show page, and in the UV.
https://oregondigital.org/concern/documents/gb19fs332 https://oregondigital.org/concern/documents/gb19fr191 https://oregondigital.org/concern/documents/gb19fq887
I noticed on a compound child that the linked parent title is showing an unresolved character encoding -- "His Majesty's government..." The encoding shows correctly in metadata and links elsewhere.
https://oregondigital.org/concern/documents/gb19fs528?locale=en
@wickr
In addition to the thumbnail issue in an earlier comment, there are some inconsistencies with fileset downloads. Out of the set of 18 works that I approved for QA:
Aside from these issues, things look good. Collection can be bulk approved once the filesets issues are fixed.
Reindexed all works that weren't showing thumbnails and they're showing now.
Made a ticket for the apostrophe entity reference showing: https://github.com/OregonDigital/OD2/issues/2861
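As a rough sketch of what resolving that kind of reference involves, assuming the title is stored or rendered with an HTML entity such as `&#39;` (an assumption; the exact entity isn't quoted in this thread), Python's standard `html.unescape` decodes it. `decode_title` is a hypothetical name used only for this example.

```python
import html


def decode_title(raw_title):
    """Decode HTML character references in a title string.

    Assumption: the apostrophe appears as an entity like '&#39;' or '&apos;'.
    """
    return html.unescape(raw_title)
```

The actual fix in OD2 may sit elsewhere (for example in how the linked parent title is escaped when rendered), so this only shows the decoding step itself.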
Still looking at the download issues.
For the FileSets, for the 5 pids you linked, I'm seeing Download options showing up and working.
For the compound parents, I'm seeing both Standard and High Quality download options on both of them now. It's possible not all of the FileSets were indexed fully earlier.
I'm going to go ahead and bulk approve.
All works are reviewed and looking good. Counts look good too.
@carakey when you have a chance the collection homepage could use the 4 featured docs/thumbnails
@wickr I finally had a chance and added some featured docs.
Item Count: 656 items
Item Types: Document
Access Restrictions: 0
Complex Objects: 20
pid List: https://github.com/OregonDigital/OD2-migration/blob/master/freshwater-treaties/freshwater-treaties_nocpds.txt https://github.com/OregonDigital/OD2-migration/blob/master/freshwater-treaties/freshwater-treaties_cpds.txt