Open jeremyf opened 10 months ago
Rob was able to attach a missing file to a file_set using the following code. We should be able to plug this into the process to handle the cases where the PDF is missing (except using perform_later instead of perform_now). This should trigger all of the subsequent splitting jobs as well.
operation = Hyrax::Operation.create!(user: user, operation_type: "Attach Remote File")
ImportUrlJob.perform_now(file_set, operation)
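As a rough sketch of how the snippet above might be plugged into a batch process with perform_later, the helper below picks out file sets with a blank mime_type and enqueues one job per file set. The method name `enqueue_missing_file_imports` and its `operation_factory:` and `job:` parameters are hypothetical (not part of Hyrax); they are injected only so the selection logic can be exercised outside the application. It also assumes each file set still carries whatever remote URL ImportUrlJob needs to fetch the file.

```ruby
# Hypothetical helper, not part of Hyrax: enqueue a remote-file import for
# every file set that has no mime_type (i.e. was never characterized).
#
# file_sets         - any Enumerable of objects responding to #mime_type
# user              - the user the operation is recorded against
# operation_factory - callable that builds the operation record
#                     (assumed to wrap Hyrax::Operation.create! in the app)
# job               - job class responding to .perform_later
#                     (assumed to be ImportUrlJob in the app)
#
# Returns the file sets for which a job was enqueued.
def enqueue_missing_file_imports(file_sets, user:, operation_factory:, job:)
  missing = file_sets.select { |file_set| file_set.mime_type.to_s.strip.empty? }

  missing.each do |file_set|
    operation = operation_factory.call(user: user, operation_type: "Attach Remote File")
    # perform_later instead of perform_now, so the import runs in the background
    job.perform_later(file_set, operation)
  end

  missing
end
```

Inside the application, a call might look like `enqueue_missing_file_imports(candidates, user: user, operation_factory: ->(**attrs) { Hyrax::Operation.create!(**attrs) }, job: ImportUrlJob)`, where `candidates` is whatever collection of file sets we decide to scan.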
Additional cleanup unrelated to PDF splitting:
We're noticing two problems with PDF ingests:
The former, not being split, is addressed in scientist-softserv/adventist_knapsack#218 and scientist-softserv/adventist-dl#689. However, we also want to consider those situations where we did not characterize the file, perhaps because it wasn't attached.
We'll need to look for some of the latter situations and determine how we might remedy the non-attached and/or non-characterized files.
Consider that the parent work has an AARK_ID, which we could use to re-fetch the file. It also likely has a Bulkrax::Entry (or two or three) that we could use to re-ingest the work.
A better solution came from Rob.
The goal is for these FileSets without mime_types to:
Related to: