OpenCTI-Platform / opencti

Open Cyber Threat Intelligence Platform
https://opencti.io

6.3.2 Worker memory leak #8629

Open MaxwellDPS opened 2 weeks ago

MaxwellDPS commented 2 weeks ago

Description

6.3.2 seems to have introduced a memory leak in the worker. Seeing high resource use and issues when running for long periods.

Environment

  1. OS (where OpenCTI server runs): CentOS Stream 9
  2. OpenCTI version: 6.3.2
  3. OpenCTI client: Worker
  4. Other environment details: Clustered k8s

Reproducible Steps

Steps to create the smallest reproducible scenario: N/A

Expected Output

Nominal memory use and workers running uninterrupted

Actual Output

Runaway memory and CPU use; also seeing issues launching threads once memory is exhausted

{"timestamp": "2024-10-07T15:07:51.138873Z", "level": "INFO", "name": "worker", "message": "Thread for queue not alive, creating a new one...", "taskName": null, "attributes": {"queue": "push_1f143a0c-508c-5ea8-90b5-a2b8ded3cb6a"}}
{"timestamp": "2024-10-07T15:07:51.141287Z", "level": "INFO", "name": "api", "message": "Health check (platform version)...", "taskName": null}
{"timestamp": "2024-10-07T15:07:57.453210Z", "level": "ERROR", "name": "worker", "message": "RuntimeError", "exc_info": "Traceback (most recent call last):\n  File \"/opt/opencti-worker/worker.py\", line 462, in start\n    self.consumer_threads[queue] = Consumer(\n                                   ^^^^^^^^^\n  File \"<string>\", line 10, in __init__\n  File \"/opt/opencti-worker/worker.py\", line 107, in __post_init__\n    self.ping.start()\n  File \"/usr/local/lib/python3.12/threading.py\", line 994, in start\n    _start_new_thread(self._bootstrap, ())\nRuntimeError: can't start new thread", "taskName": null, "attributes": {"reason": "can't start new thread"}}
----------------------------------------
Exception occurred during processing of request from ('192.168.6.101', 58230)
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/socketserver.py", line 318, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/local/lib/python3.12/socketserver.py", line 706, in process_request
    t.start()
  File "/usr/local/lib/python3.12/threading.py", line 994, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
----------------------------------------
----------------------------------------
Exception occurred during processing of request from ('192.168.6.101', 60524)
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/socketserver.py", line 318, in _handle_request_noblock
    self.process_request(request, client_address)
  File "/usr/local/lib/python3.12/socketserver.py", line 706, in process_request
    t.start()
  File "/usr/local/lib/python3.12/threading.py", line 994, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
----------------------------------------
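To check whether the number of live threads grows alongside memory, a periodic snapshot along these lines could be logged from inside the process. This is only a diagnostic sketch and not part of the OpenCTI worker; `thread_snapshot` is a hypothetical helper name, and it uses nothing beyond the standard `threading` module.

```python
import threading
from collections import Counter


def thread_snapshot():
    """Return (live thread count, tally of live threads by class name)."""
    threads = threading.enumerate()  # all threads currently alive
    kinds = Counter(type(t).__name__ for t in threads)
    return len(threads), kinds


if __name__ == "__main__":
    total, kinds = thread_snapshot()
    print(f"live threads: {total}, by class: {dict(kinds)}")
```

A count that keeps climbing between snapshots would point at threads being created without the old ones ever exiting.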

Additional information

This is not triggering an OOM kill, so k8s auto recovery never kicks in
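Since the failure shows up as `RuntimeError: can't start new thread` rather than an OOM kill, one possible workaround is a liveness check that tests thread creation directly. The sketch below is an assumption, not an existing OpenCTI feature: `can_start_thread` is a hypothetical helper, and it would have to run inside the worker process (for example behind its health endpoint) to be meaningful.

```python
import threading


def can_start_thread(timeout: float = 1.0) -> bool:
    """Return True if a trivial thread can be started and finishes in time."""
    probe = threading.Thread(target=lambda: None, daemon=True)
    try:
        probe.start()
    except RuntimeError:
        # Same failure as in the logs above: "can't start new thread"
        return False
    probe.join(timeout)
    return not probe.is_alive()


if __name__ == "__main__":
    # A non-zero exit code lets a supervisor treat the process as unhealthy.
    raise SystemExit(0 if can_start_thread() else 1)
```

If the check fails, the pod could be restarted by a liveness probe even though the memory limit is never reached.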

image

richard-julien commented 2 weeks ago

Hi @MaxwellDPS, do you have the logs produced by the workers during the peak (10/04 13h to 10/05 00h)?