Open MaxwellDPS opened 2 weeks ago
6.3.2 seems to have introduced a memory leak in the worker. We are seeing high resource use and problems when running for long periods.
Steps to create the smallest reproducible scenario: N/A
Expected output: nominal memory use and workers running uninterrupted.
Actual output: runaway memory and CPU use, plus failures to launch new threads once memory is exhausted.
{"timestamp": "2024-10-07T15:07:51.138873Z", "level": "INFO", "name": "worker", "message": "Thread for queue not alive, creating a new one...", "taskName": null, "attributes": {"queue": "push_1f143a0c-508c-5ea8-90b5-a2b8ded3cb6a"}}
{"timestamp": "2024-10-07T15:07:51.141287Z", "level": "INFO", "name": "api", "message": "Health check (platform version)...", "taskName": null}
{"timestamp": "2024-10-07T15:07:57.453210Z", "level": "ERROR", "name": "worker", "message": "RuntimeError", "exc_info": "Traceback (most recent call last):\n File \"/opt/opencti-worker/worker.py\", line 462, in start\n self.consumer_threads[queue] = Consumer(\n ^^^^^^^^^\n File \"<string>\", line 10, in __init__\n File \"/opt/opencti-worker/worker.py\", line 107, in __post_init__\n self.ping.start()\n File \"/usr/local/lib/python3.12/threading.py\", line 994, in start\n _start_new_thread(self._bootstrap, ())\nRuntimeError: can't start new thread", "taskName": null, "attributes": {"reason": "can't start new thread"}}
----------------------------------------
Exception occurred during processing of request from ('192.168.6.101', 58230)
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/socketserver.py", line 318, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/local/lib/python3.12/socketserver.py", line 706, in process_request
    t.start()
  File "/usr/local/lib/python3.12/threading.py", line 994, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
----------------------------------------
----------------------------------------
Exception occurred during processing of request from ('192.168.6.101', 60524)
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/socketserver.py", line 318, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/local/lib/python3.12/socketserver.py", line 706, in process_request
    t.start()
  File "/usr/local/lib/python3.12/threading.py", line 994, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
----------------------------------------
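The `can't start new thread` errors above suggest that threads, and not only memory, may be piling up. One way to check this is to log a per-name count of live threads at intervals and watch which group grows. The following is a minimal diagnostic sketch, not part of the OpenCTI worker; the function name `thread_census` and the name-grouping rule are assumptions made for illustration.

```python
import re
import threading
from collections import Counter


def thread_census() -> Counter:
    """Count live threads in this process, grouped by name prefix.

    Hypothetical helper: trailing counters such as "-3" or "-1 (target)"
    are stripped so that "Thread-1" and "Thread-2" fall into one bucket.
    """
    return Counter(
        re.sub(r"[-_ ]?\d+.*$", "", t.name) for t in threading.enumerate()
    )
```

Calling `thread_census()` every few minutes and logging the result would show whether one group of threads grows without bound while the others stay flat.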
The worker never hits an OOM kill, so k8s auto-recovery is not triggered.
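Because no OOM kill occurs, a possible stopgap is a liveness check that fails once the thread count passes a threshold, so that Kubernetes restarts the pod. The sketch below is an assumption-laden example, not the worker's existing health check: `MAX_THREADS`, `worker_is_healthy`, `LivenessHandler`, `serve_liveness`, and port 8080 are all hypothetical.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

MAX_THREADS = 200  # hypothetical threshold; tune to the deployment


def worker_is_healthy(max_threads: int = MAX_THREADS) -> bool:
    """Return False once the process holds more live threads than expected."""
    return threading.active_count() <= max_threads


class LivenessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # 200 keeps the pod alive; 503 makes an httpGet liveness probe fail.
        self.send_response(200 if worker_is_healthy() else 503)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # keep probe traffic out of the worker logs


def serve_liveness(port: int = 8080) -> None:
    # Plain HTTPServer answers requests in the calling thread, so replying
    # to a probe does not itself require starting a new thread.
    HTTPServer(("", port), LivenessHandler).serve_forever()
```

`serve_liveness()` would have to run in a thread started at process startup, before any exhaustion occurs. The single-threaded `HTTPServer` is chosen deliberately, since the tracebacks above show a thread-per-request server failing at exactly the moment a probe would be needed.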
Hi @MaxwellDPS, do you have the logs produced by the workers during the peak (10/04 13h to 10/05 00h)?