Open ShanaLMoore opened 7 months ago
from Rob:
to start issue import jobs from the Rails console:
rails c
switch! 'adl'
batch = 5
GoodJob::Job.where(scheduled_at: DateTime.parse("2040-01-01 00:00:00")).limit(batch).update_all(scheduled_at: nil)
The batch size can be increased as we gain confidence that things are working.
I scheduled them all for 2040-01-01, so we shouldn't have to worry about them starting on their own.
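To make the release step repeatable, the console commands above could be wrapped in a small helper. This is only a sketch: `release_parked_jobs` and `PARKED_AT` are hypothetical names, not part of GoodJob, and the helper assumes it is given `GoodJob::Job` (or any ActiveRecord-style relation) after `switch!` has selected the tenant.

```ruby
require "date"

# Placeholder date the import jobs were parked at (taken from the note above).
PARKED_AT = DateTime.parse("2040-01-01 00:00:00")

# Hypothetical helper: releases up to `batch_size` parked jobs by clearing
# their scheduled_at, then reports how many jobs are still parked.
def release_parked_jobs(job_scope, batch_size)
  parked = job_scope.where(scheduled_at: PARKED_AT)
  released = parked.limit(batch_size).update_all(scheduled_at: nil)
  { released: released, still_parked: parked.count }
end
```

In the console this would be called as `release_parked_jobs(GoodJob::Job, 5)`; the `still_parked` count gives a rough idea of how many batches are left.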
Test batches that failed with entries needing to be rerun:
TODO - re split or re run importer for books?
Problems I am seeing:
Jobs usually, but not always, appear to have imported correctly.
Can obtain a fileset id to identify which work threw the error. Re-running one ended normally.
Unable to link to a failing work from the error. The job log shows the error, but it is not included in the message sent through mailboxer.
In this case, no file is attached to the fileset.
Update 12/2/2023: Rob added additional logging that appears in the mailboxer notifications.
No errors show in jobs... Errors show up in the dashboard notifications. Can query notifications for a list: Mailboxer::Notification.where("subject like ?", "%Error%")
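When there are many error notifications, grouping them by subject can make repeated failures easier to spot. The following is a sketch only: `error_subject_counts` is a hypothetical helper name, it assumes it is passed `Mailboxer::Notification` in the tenant's console, and it relies on `Array#tally` (Ruby 2.7 or later).

```ruby
# Hypothetical helper: counts error notifications per subject, most
# frequent first. Uses the same "subject like" query as the note above.
def error_subject_counts(notification_scope)
  notification_scope
    .where("subject like ?", "%Error%")
    .pluck(:subject)
    .tally
    .sort_by { |_subject, count| -count }
    .to_h
end
```

Usage would look like `error_subject_counts(Mailboxer::Notification)`, which returns a hash of subject to count.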
The errors come from the load_from_fedora method in ActiveFedora's finder_methods.rb.
The majority of these still have a file attached to the fileset but no pdf splitting.
Update 12/2/2023: Rob added additional logging that appears in the mailboxer notifications.
This seems to be a failure in iiif_print, where the local PDF file doesn't hang around long enough.
Update: Put in a patch to iiif_print to make sure the PDF file is available at the appropriate location when we try to split.
The jobs retry 5 times and then end with no failure. This doesn't cause any error but definitely is a waste of processing time.
Update: Patch put in place 11/3/2023 to prevent these.
Appeared as a no-method error when calling iiif_print_config. Can obtain the file set ID from the job to identify which work failed.
Occasionally submitted with null arguments.
These reschedule themselves 5 times and then end normally. Not sure what errors trigger the null arguments.
The config to clean out completed jobs does not seem to be working, so in the meantime, run this periodically in the tenant's console:
jobs = GoodJob::Job.where(error: nil).where("finished_at < ?", 7.days.ago)
jobs.find_each(&:destroy)
Notifications are overwhelming and timing out the page, so successful ones need to be cleaned up periodically. Note that this doesn't address Mailboxer::Receipt entries, but it does enable the notifications page to be opened again.
Inside the console for a tenant:
notifs = Mailboxer::Notification.where(subject: "Passing batch create")
convos = Mailboxer::Conversation.where(subject: "Passing batch create")
notifs.find_each(&:destroy)
convos.find_each(&:destroy)
Rob to provide a script to run in rancher.
Dev should run it, one at a time, and not start the next one until the jobs have finished.
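Until that script exists, one way to check whether a batch has finished is to count jobs that have been released but have no `finished_at` yet. This is a rough sketch, not Rob's script: `released_jobs_in_flight` is a hypothetical name, and it assumes parked jobs are exactly those scheduled at 2040-01-01 and that retrying jobs keep `finished_at` empty until their last attempt.

```ruby
require "date"

# Hypothetical helper: counts jobs that are unfinished and not parked.
# The raw SQL keeps jobs whose scheduled_at is NULL (just released), which a
# plain where.not(scheduled_at: ...) would silently exclude.
def released_jobs_in_flight(job_scope, parked_at: DateTime.parse("2040-01-01 00:00:00"))
  job_scope
    .where(finished_at: nil)
    .where("scheduled_at IS NULL OR scheduled_at < ?", parked_at)
    .count
end
```

When `released_jobs_in_flight(GoodJob::Job)` returns 0 in the tenant's console, the previous batch has presumably finished and the next one can be started.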