Open jrlegrand opened 3 months ago
This is also breaking my rxnorm_historical locally
Assuming it is a memory issue: https://medium.com/brexeng/debugging-and-preventing-memory-errors-in-python-e00be55e7cf2
It gets through all API calls and then silently crashes before it exists the concurrent API calls function. Assuming it has to do with ThreadPoolExecutor taking up too much memory somehow.
I think something's happening in the ThreadPoolExecutor. I added some logging and it doesn't seem to get past the for loop before stalling out forever (or giving the SIGKILL message in "PRD").
Need to try asynchronous process. Look into async.io python library.
Suspect once concurrency is up and running with as many threads it can handle, the handoff trying to reassign the threads is causing it to break. Could also be a docker issue - choking on concurrent requests.
@NTBTI FYI - this is the bug you ran into.
Problem Statement
DAG runs for a while and then errors out. Doesn't do this on my local machine. Brief googling suggests it's memory related, but hard to diagnose. See this: https://github.com/apache/airflow/issues/10435
Error message:
[2024-07-10, 11:48:05 UTC] {local_task_job.py:208} INFO - Task exited with return code Negsignal.SIGKILL
Criteria for Success
DAG runs to completion in "PRD".
Additional Information