Got a bunch of these on rockfish, and I don't think we're helping ourselves by calling os.listdir every 5ms:
2024-06-11 12:42:09,485 ERROR [pulsar.managers.stateful][[manager=rockfish]-[action=monitor]] Failure in stateful manager monitor step.
Traceback (most recent call last):
  File "/data/nekrut/galaxy/main/pulsar/venv/lib/python3.9/site-packages/pulsar/managers/stateful.py", line 364, in _run
    self._monitor_active_jobs()
  File "/data/nekrut/galaxy/main/pulsar/venv/lib/python3.9/site-packages/pulsar/managers/stateful.py", line 369, in _monitor_active_jobs
    active_job_ids = self.stateful_manager.active_jobs.active_job_ids()
  File "/data/nekrut/galaxy/main/pulsar/venv/lib/python3.9/site-packages/pulsar/managers/stateful.py", line 310, in active_job_ids
    job_ids = os.listdir(target_directory)
OSError: [Errno 23] Too many open files in system: '/scratch4/nekrut/galaxy/main/pulsar/var/rockfish-active-jobs'
2024-06-11 12:42:09,489 ERROR [pulsar.managers.stateful][[manager=rockfish]-[action=monitor]] Failure in stateful manager monitor step.
Traceback (most recent call last):
  File "/data/nekrut/galaxy/main/pulsar/venv/lib/python3.9/site-packages/pulsar/managers/stateful.py", line 364, in _run
    self._monitor_active_jobs()
  File "/data/nekrut/galaxy/main/pulsar/venv/lib/python3.9/site-packages/pulsar/managers/stateful.py", line 369, in _monitor_active_jobs
    active_job_ids = self.stateful_manager.active_jobs.active_job_ids()
  File "/data/nekrut/galaxy/main/pulsar/venv/lib/python3.9/site-packages/pulsar/managers/stateful.py", line 310, in active_job_ids
    job_ids = os.listdir(target_directory)
OSError: [Errno 23] Too many open files in system: '/scratch4/nekrut/galaxy/main/pulsar/var/rockfish-active-jobs'
2024-06-11 12:42:09,494 ERROR [pulsar.managers.stateful][[manager=rockfish]-[action=monitor]] Failure in stateful manager monitor step.
Traceback (most recent call last):
  File "/data/nekrut/galaxy/main/pulsar/venv/lib/python3.9/site-packages/pulsar/managers/stateful.py", line 364, in _run
    self._monitor_active_jobs()
  File "/data/nekrut/galaxy/main/pulsar/venv/lib/python3.9/site-packages/pulsar/managers/stateful.py", line 369, in _monitor_active_jobs
    active_job_ids = self.stateful_manager.active_jobs.active_job_ids()
  File "/data/nekrut/galaxy/main/pulsar/venv/lib/python3.9/site-packages/pulsar/managers/stateful.py", line 310, in active_job_ids
    job_ids = os.listdir(target_directory)
OSError: [Errno 23] Too many open files in system: '/scratch4/nekrut/galaxy/main/pulsar/var/rockfish-active-jobs'
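
To make the concern concrete, here is a rough sketch of what a gentler monitor loop could look like. This is not Pulsar's actual code: `monitor_with_backoff`, `next_delay`, and the `base`/`cap` values are hypothetical names and numbers chosen for illustration. The idea is simply to wait longer between `os.listdir` calls after an ENFILE/EMFILE failure instead of retrying every few milliseconds.

```python
import errno
import os
import time


def next_delay(current, failed, base=1.0, cap=60.0):
    """Double the delay after a failure (bounded by cap); reset to base on success."""
    if failed:
        return min(current * 2, cap)
    return base


def monitor_with_backoff(target_directory, iterations,
                         listdir=os.listdir, sleep=time.sleep,
                         base=1.0, cap=60.0):
    """Hypothetical monitor loop: list the directory, back off on fd exhaustion.

    Returns the list of delays used, so the behaviour is easy to inspect.
    """
    delay = base
    delays = []
    for _ in range(iterations):
        failed = False
        try:
            job_ids = listdir(target_directory)
            # A real monitor would process job_ids here.
        except OSError as exc:
            # ENFILE (system-wide) and EMFILE (per-process) both mean
            # "too many open files"; anything else is re-raised.
            if exc.errno not in (errno.ENFILE, errno.EMFILE):
                raise
            failed = True
        delay = next_delay(delay, failed, base, cap)
        delays.append(delay)
        sleep(delay)
    return delays
```

With something along these lines, a burst of ENFILE errors would produce a handful of log entries with growing gaps rather than one every few milliseconds, and the loop would add less pressure on the file table while the system recovers.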