scientist-softserv / louisville-hyku


[3] NEW OCR: Install and enable Newspaper Gem #108

Closed orangewolf closed 1 year ago

orangewolf commented 1 year ago

Testing

orangewolf commented 1 year ago

Linked Items

EPIC 2 - Newspaper Work

orangewolf commented 1 year ago

Shana January 2022

Dev Notes: NNP QA - Diem includes an example of how she tested this: newman-numismatic-portal#303. So far, when testing in dev, I verified that OCR is generated. We are able to search for text from the home page and find the related work (https://share.getcloudapp.com/4guZZnlW). However, I've also noticed that we don't have a Universal Viewer (UV), and I created a ticket for that. I believe we will need the ability to search in the UV for the exact word.

orangewolf commented 1 year ago

Shana January 2022

No code changes have been needed since pulling in the Hyku update. Tested on STAGING:

NOTE: The UV is tiny for some reason. However, I was able to verify OCR functionality by expanding the UV to full screen.

Image

txt example

Image

Search for OCR txt: "Messrs"

Image

orangewolf commented 1 year ago

Crystal January 2022

Test with manual upload for a work.

orangewolf commented 1 year ago

Summer January 2022

I am able to pass this on staging:

I made a work here: https://lv-hyku-staging.notch8.cloud/concern/generic_works/024c40a1-5355-460c-89a5-cc92e2152fc0

Uploaded a Panera Bread menu PDF.

Searched for "Avocado", a string contained in the PDF, and the search returned the work successfully.

Image

Notch8 QA: Passed

orangewolf commented 1 year ago

Rachel February 2022

The Leader test in Shana's screenshot doesn't appear to be on this sandbox, so I was unable to test it.

orangewolf commented 1 year ago

Rachel February 2022

The Cardinal example from 2/18/2022 is great! I only provided a PDF to you for this (although we also have image files for each page). I take it this means that a PDF will suffice? I'm still trying to determine the best, most efficient approach for upload, since filling in page-level source identifiers and file names for a collection of this size will be arduous if JPEGs are preferred.

orangewolf commented 1 year ago

Alisha March 2022

@rachel.howard are you OK with closing this? The install does appear to have happened correctly, but we are aware of the UV search issue with OCR, and it's being tracked on #90 (closed).