pulibrary / lib_jobs

Enterprise Services batch processing tasks. Rails 7 Ruby 3.1.0

Allow large files to process without running out of memory #698

Closed sandbergja closed 5 months ago

sandbergja commented 5 months ago

This patch replaces various occurrences where huge MARC files are read into memory as Strings or StringIOs. Instead, it uses:

Related to #695

I tried this branch and the main branch on staging, with 4 files:

| branch | peak memory consumption | memory consumption pattern | time |
| --- | --- | --- | --- |
| this one | 0.4 GB | steady during the whole run | 4 minutes 33 seconds |
| main | 1.2 GB | growing at a regular pace during the whole run | 4 minutes 7 seconds |

For what it's worth, there are 475 files that need to be processed in prod.
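
For reviewers who want a picture of the general idea, here is a minimal sketch of streaming a binary MARC file one record at a time instead of loading it all into a String. It is not the code in this patch: `each_raw_marc_record` and the sample data are hypothetical, and it only relies on the fact that binary MARC (ISO 2709) ends each record with the byte 0x1D.

```ruby
require "tempfile"

# Binary MARC (ISO 2709) terminates every record with the byte 0x1D.
MARC_RECORD_TERMINATOR = "\x1D".b

# Hypothetical helper, for illustration only (not part of this patch).
# Yields one raw record at a time, so only the current record is held
# in memory rather than the contents of the whole file.
def each_raw_marc_record(path)
  File.open(path, "rb") do |io|
    # IO#each_line accepts a custom separator and reads lazily from disk.
    io.each_line(MARC_RECORD_TERMINATOR) { |raw| yield raw }
  end
end

# Demo with made-up data standing in for real MARC records.
Tempfile.create("sample.mrc") do |file|
  file.binmode
  file.write("first fake record\x1Dsecond fake record\x1D")
  file.flush

  each_raw_marc_record(file.path) do |raw|
    puts "read a record of #{raw.bytesize} bytes"
  end
end
```

By contrast, something like `File.read(path)` or wrapping the full contents in a `StringIO` keeps the entire file in memory at once, which is the pattern this PR moves away from.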