PaloAltoNetworks / SafeNetworking

Read-only mirror. To contribute or submit issues, please go to:
https://gitlab.com/panw-gse/as/SafeNetworking/
Apache License 2.0

During heavy load, ES times out and processing halts #71

Closed (punisherVX closed 5 years ago)

punisherVX commented 5 years ago

When ES is under heavy load or pauses for garbage collection, the default timeout (10s) is not enough. When a request times out, all SFN processing halts and never starts again. See the stack trace:

GET http://localhost:9200/threat-*/_search [status:N/A request:10.018s]
Traceback (most recent call last):
  File "/home/ubuntu/safe-networking/sfn-env/lib/python3.6/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 2, in raise_from
  File "/home/ubuntu/safe-networking/sfn-env/lib/python3.6/site-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/usr/lib/python3.6/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.6/http/client.py", line 258, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
socket.timeout: timed out

punisherVX commented 5 years ago

There are two ways to try to fix this: set the timeout to more than 10s (as documented here), or figure out why GC is taking so long and fix that.
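
For illustration only, here is a minimal sketch of the first option combined with a retry, so that a single timeout does not stop processing for good. It is not the change made in the fixing commit. The helper name `call_with_growing_timeout` and all of its parameters are hypothetical, and `fn` stands in for whatever function issues the ES query with a given timeout in seconds.

```python
import socket
import time


def call_with_growing_timeout(fn, initial_timeout=10.0, factor=2.0,
                              max_attempts=4, retry_on=(socket.timeout,),
                              pause=0.0):
    """Call fn(timeout), retrying with a longer timeout after each timeout error.

    Hypothetical helper: fn is any callable that takes the timeout (seconds)
    to use for one request. retry_on lists the exception types to retry on.
    """
    timeout = initial_timeout
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(timeout)
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller decide what to do
            time.sleep(pause)   # optional breather, e.g. to let a GC pause end
            timeout *= factor   # allow more time on the next attempt
```

With the defaults, the attempts would use timeouts of 10s, 20s, 40s, and 80s before giving up. If the query goes through the Python Elasticsearch client, the exception that reaches the caller may be the client's own timeout exception rather than `socket.timeout`, so `retry_on` would likely need to be set to match whichever client version is installed.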

punisherVX commented 5 years ago

This is fixed in fbc36f06ce9087ba0ab1c3e72801a5156c2f4846