scientist-softserv / adventist_knapsack

Bulkrax OAI importer issues #247

Closed KatharineV closed 4 months ago

KatharineV commented 8 months ago

I created two Bulkrax OAI importers on demo.adventist-knapsack. Both importers pulled 10 records from the adl:image OAI set. The importers both have status of "Complete (with failures)." The error code is ArgumentError. All 20 records came into the repo, but no files came in.

  1. First importer, selected to skip thumbnails
  2. Second importer, selected to include thumbnails

The importers did not map metadata exactly according to the ADL and Hyku Maps document.

Work on Demo (first importer): https://demo.adventist-knapsack-staging.notch8.cloud/concern/images/20119303_untitled?locale=en
Work on ADL Prod (first importer): https://adl.b2.adventistdigitallibrary.org/concern/images/20119303_untitled
Work's OAI metadata: http://oai.adventistdigitallibrary.org/OAI-script?verb=GetRecord&metadataPrefix=oai_adl&identifier=20119303

Work on Demo (second importer): https://demo.adventist-knapsack-staging.notch8.cloud/concern/images/20119409_may_covington?locale=en
Work on ADL Prod (second importer): https://adl.b2.adventistdigitallibrary.org/concern/images/20119409_may_covington
Work's OAI metadata: http://oai.adventistdigitallibrary.org/OAI-script?verb=GetRecord&metadataPrefix=oai_adl&identifier=20119409
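For anyone comparing a work's source metadata against what the importer mapped, here is a minimal sketch of how the OAI-PMH `GetRecord` links above are put together and how the record header could be read. The base URL, `oai_adl` prefix, and identifiers come from the links above. The helper names `get_record_url` and `parse_header` are hypothetical, and the parser assumes the server answers in the standard OAI-PMH 2.0 namespace, which has not been verified here.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Base URL and metadata prefix are taken from the links in this comment.
OAI_BASE_URL = "http://oai.adventistdigitallibrary.org/OAI-script"
# Assumption: responses use the standard OAI-PMH 2.0 namespace.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def get_record_url(identifier, metadata_prefix="oai_adl", base_url=OAI_BASE_URL):
    """Build an OAI-PMH GetRecord URL for a single record (hypothetical helper)."""
    query = urlencode({
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    })
    return f"{base_url}?{query}"


def parse_header(xml_text):
    """Return the identifier and set memberships from a GetRecord response.

    Returns None when the response contains no record header.
    """
    root = ET.fromstring(xml_text)
    header = root.find(".//oai:header", OAI_NS)
    if header is None:
        return None
    return {
        "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
        "sets": [el.text for el in header.findall("oai:setSpec", OAI_NS)],
    }
```

Calling `get_record_url("20119303")` and `get_record_url("20119409")` reproduces the two OAI metadata links above. Fetching those URLs and passing the response body to `parse_header` would give the identifier and set to check against the imported work.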

MAPPING ISSUES:

KatharineV commented 8 months ago

Continued testing shows that all OAI sets import with status "Complete (with failures)." Consistent ArgumentError across all tests.

Checking metadata mapping for all OAI sets and work types that import via the existing Adventist OAI shows the same identifier and AARK identifier mapping errors and the same failure to import files. Other fields map as expected.

Work types tested: Published work (adl:book and adl:issue); Generic work (adl:other); Thesis work (adl:thesis)

KatharineV commented 8 months ago

Please note that importers I created for testing on 10/9 did not have files attached at the time, and no files have rendered in the repo. Today (10/11) I reran an adl:image importer and selected "reharvest." The files are still not showing up.

https://demo.adventist-knapsack-staging.notch8.cloud/importers/35?locale=en

jeremyf commented 8 months ago

Triage:

I have compared the parser mappings, and Knapsack adds a handler for the description.abstract import, which means we should have the same mappings.

I'll be looking into the loading sequence as well as the difference between v5.3.0 and v5.4.1 to see what might have introduced the errors.

jeremyf commented 8 months ago

One possible bug introduced is from this PR: https://github.com/samvera-labs/bulkrax/pull/853

KatharineV commented 7 months ago

Team, I've continued testing Bulkrax imports to Knapsack, and I want to report that a CSV with a valid URL in the Related URL field has not imported the file. The importer says it failed due to an argument error, but the metadata has imported and the work is created. The file is what's missing, and it did import correctly on SDAPI staging (see links below). The title of this ticket should perhaps more accurately read "Bulkrax importer issues on Knapsack." Both OAI and CSV imports are impacted.

Knapsack (file failed to import): https://demo.adventist-knapsack-staging.notch8.cloud/importers/80?locale=en
SDAPI Staging (file imported as expected): https://sdapi.s2.adventistdigitallibrary.org/importers/141?locale=en

This is critical and I would mark this ticket among the highest priorities to fix with whatever remains of our Knapsack/upgrade hours.

KatharineV commented 7 months ago

An update: As of today (11-29-2023), trying to create either a CSV or OAI importer causes an error message to appear, and nothing works. It's not that the importers are created and then fail. They don't get created at all.

Image

kirkkwang commented 7 months ago

Thanks for checking, @KatharineV. We'll take a look.

kirkkwang commented 7 months ago

@KatharineV It seems it was a Fedora issue. We've since restarted it and it should be working. Can you check again when you get a chance?

KatharineV commented 7 months ago

@kirkkwang The CSV importer worked beautifully this time: https://adl.adventist-knapsack-staging.notch8.cloud/importers/35?locale=en

The OAI importer looks successful, but the works it created won't open. I can see them in the catalog view, but trying to open the works generates an error message. Could it be related to the Fedora issue?

image

image

image

https://adl.adventist-knapsack-staging.notch8.cloud/concern/theses/20121848_problems_in_presenting_the_gospel_to_the_hindu_mind

ShanaLMoore commented 7 months ago

@KatharineV I believe this issue has been resolved. I clicked on the link you've provided and see the work.

Image

KatharineV commented 7 months ago

@ShanaLMoore It's weirdly still not loading for me! I tried Firefox, Edge, and Chrome in case it was a browser issue. I got the "We're sorry but something went wrong" message on all three attempts.

ShanaLMoore commented 7 months ago

@KatharineV I needed to restart Fedora. We are (still) looking into this issue, but this particular work wasn't loading because it couldn't connect to Fedora. Please try it again.

image

KatharineV commented 7 months ago

The work page opens now, but the work still doesn't display to me as it did for you in the screenshot above. Some aspects of the page aren't loading yet. See below:

image

ShanaLMoore commented 7 months ago

Pulling this back to In Development.

Jeremy, LaRita and I confirmed that the page loads but we don't see a file. (Chrome, Safari)

I see the file attached in Firefox even after a cache refresh, and can download it too, but I'm in the minority, so I think this is a moot point. I'll eventually restart all the things to check again, but LaRita and Jeremy confirmed they can't see it in their Firefox either.

However, I can find the file set in Knapsack staging's Rails console using find and where clauses. I can also see the Solr document.

See slack thread for more sleuthing notes: https://assaydepot.slack.com/archives/C0311DN2YCA/p1701385487550419?thread_ts=1701364436.127349&cid=C0311DN2YCA

ShanaLMoore commented 4 months ago

"This is critical and I would mark this ticket among the highest priorities to fix with whatever remains of our Knapsack/upgrade hours."

ref: https://github.com/scientist-softserv/adventist_knapsack/issues/247

It looks like this issue with the CSV import has been resolved.

On knapsack staging I created the same importer and it imported with success.

Additionally, the following previously missing mappings are now present:

![Image](https://github.com/scientist-softserv/adventist-dl/assets/10081604/90263c4a-a26c-4312-812b-a496cd6fb997)

Looking into OAI next...

ShanaLMoore commented 4 months ago

Hi @KatharineV

I hate to ask this, but I'm wondering if you can retest this ticket and/or lay out the steps to reproduce your issue. It's a rather old ticket now, so I also totally understand if you don't remember, in which case I'd suggest we close this one and create new issues as they arise.

In the above comment I successfully imported the Related URL CSV on the demo tenant of Adventist Knapsack. See results.

I've also successfully created an OAI article importer (limit 3). Here are the results. I've recreated the image ones too, as detailed in the description.

![Image](https://github.com/scientist-softserv/adventist-dl/assets/10081604/0e2d7238-583a-4b44-b20d-92097c8be118)

I am able to visit each imported record and download the PDFs. Additionally I see the identifier and aark identifiers displayed.

At this time it isn't clear to me what the issue is, so I'm not able to address it. Please let us know how you'd like to proceed.

KatharineV commented 4 months ago

@ShanaLMoore Thanks for the heads-up. I did retest this ticket and I'm seeing the same thing as you. Whatever was causing issues before is no longer present. I will close this ticket and create a new one if new issues arise, as you suggest.