pulibrary / lib_jobs

Enterprise Services batch processing tasks. Rails 7 Ruby 3.1.0

Submit collection does not process records #695

Open christinach opened 6 months ago

christinach commented 6 months ago

Expected behavior

Submit Collection rake task runs daily at 12pm UTC and uploads processed records to AWS/SCSB

Actual behavior

Submit Collection rake task runs daily at 12pm UTC. 0 records are processed.

Steps to replicate

Visit lib-jobs and filter by AlmaSubmitCollection. The last upload was 2024-02-14, with 24657 records processed. 2024-02-16 has 0 records processed.

Impact of this bug

We don't send PUL RECAP records to SCSB.

Implementation notes, if any

Redis is not the issue since the application does not use it.

sandbergja commented 6 months ago

@maxkadel and I checked the following:

We are running the job in a tmux, and monitoring datadog as it runs.

We were also wondering why there is no data set for the 15th, for any date after the 16th, or for the 8th through the 14th.

sandbergja commented 6 months ago

@maxkadel and I started a tmux session running the job (submit-collection-troubleshooting is the session name). When the job is running, the box runs out of memory and kills the job, so it never completes.

sandbergja commented 6 months ago

lib-jobs-prod2 now has 32 gigs of RAM. Re-running the job now. We should also see if we can make this less memory-intensive.
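One general way to make a job like this less memory-intensive is to stream its input in fixed-size batches instead of loading everything at once. The following is only a minimal sketch under that assumption, not the lib_jobs implementation: `each_batch`, the one-record-per-line input file, and the batch size are all hypothetical.

```ruby
# Hypothetical helper (not part of lib_jobs): yields fixed-size batches of
# lines from a file so that only one batch is held in memory at a time.
def each_batch(path, batch_size: 1000)
  # Return an Enumerator when no block is given, like most Ruby iterators.
  return enum_for(:each_batch, path, batch_size: batch_size) unless block_given?

  # File.foreach reads the file line by line; each_slice groups the lines
  # into arrays of at most batch_size elements as they are read.
  File.foreach(path).each_slice(batch_size) do |lines|
    yield lines.map(&:chomp)
  end
end
```

Each batch can then be processed and discarded before the next one is read, so peak memory depends on the batch size rather than on the total number of records.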

christinach commented 5 months ago

closed by https://github.com/pulibrary/lib_jobs/pull/698

Discussed this further in the Rails performance goal with @sandbergja @carolyncole @rladdusaw @acozine

sandbergja commented 5 months ago

To catch up on the index, we'll:

kevinreiss commented 4 months ago

See @maxkadel's comment from the work on metering the files when we look at this again.

maxkadel commented 4 months ago

> See @maxkadel's comment from the work on metering the files when we look at this again.

Actually, I think I was wrong. I think the name is misleading: right now it's an array of pointers to temp files, which I don't think should take up a lot of memory (I think previously it may have been StringIO, which did use a lot of memory).
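To illustrate the difference in general terms, here is a minimal sketch, not the actual lib_jobs code. A `StringIO` keeps its whole contents on the Ruby heap, while a `Tempfile` object is a small handle to bytes stored on disk. The helper names and the `submit_collection_sketch` file prefix are hypothetical.

```ruby
require 'stringio'
require 'tempfile'

# In-memory variant: every StringIO holds its full payload on the Ruby heap,
# so an array of them grows with the total size of the data.
def build_string_ios(payloads)
  payloads.map { |data| StringIO.new(data) }
end

# On-disk variant: each Tempfile is a small handle; the payload bytes live in
# the file system, so the array itself stays small.
def build_tempfiles(payloads)
  payloads.map do |data|
    file = Tempfile.new('submit_collection_sketch') # hypothetical prefix
    file.write(data)
    file.flush  # push buffered bytes out to the file
    file.rewind # position at the start for later reads
    file
  end
end
```

With the on-disk variant, contents are read back only when a file is actually used, for example one file at a time during upload, and each file can be removed afterwards with `close!`.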