Open eharkins opened 2 years ago
The last increase in the memory we ask for was earlier this week: https://github.com/nextstrain/ncov-ingest/commit/5db5d2574210d60b4ae6434248af64dd3187a781. Maybe we should raise it again for now while we implement a more scalable solution?
These workflows:
keep running out of memory on AWS and being killed, e.g. https://github.com/nextstrain/ncov-ingest/runs/3764464129?check_suite_focus=true.
This likely happens during the run of https://github.com/nextstrain/ncov-ingest/blob/master/bin/transform-gisaid since it takes gisaid.ndjson (the raw GISAID full dataset, which is over 100GB) as input, and performs a bunch of operations on it.

To avoid continually increasing the resources we ask for on the batch job, here are some ideas:
@tsibley said:
@rneher said:
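
For illustration only, and not taken from either comment above: one generic way to keep memory roughly constant on a large NDJSON file is to process it one record at a time instead of loading it whole. The sketch below assumes the work can be done per record; the function names and the whitespace-stripping transform are hypothetical and are not what bin/transform-gisaid actually does.

```python
import json
from typing import Iterable, Iterator, TextIO


def read_ndjson(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line, without reading the whole file."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def transform(records: Iterable[dict]) -> Iterator[dict]:
    """Hypothetical per-record step: strip whitespace from string values."""
    for record in records:
        yield {
            key: value.strip() if isinstance(value, str) else value
            for key, value in record.items()
        }


def write_ndjson(records: Iterable[dict], out: TextIO) -> int:
    """Write each record as one JSON line and return how many were written."""
    count = 0
    for record in records:
        out.write(json.dumps(record) + "\n")
        count += 1
    return count
```

Chained as `write_ndjson(transform(read_ndjson(infile)), outfile)`, only one record is held in memory at a time. Steps that need the full dataset at once, such as sorting or deduplicating across records, would not fit this pattern without further changes.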