orlikoski / CDQR

The Cold Disk Quick Response (CDQR) tool is a fast and easy to use forensic artifact parsing tool that works on disk images, mounted drives and extracted artifacts from Windows, Linux, MacOS, and Android devices
GNU General Public License v3.0

Manage Timeout #25

Closed garanews closed 5 years ago

garanews commented 5 years ago

While exporting results in Kibana format to the Elasticsearch server I got this error:

```
2018-09-24 21:42:32,266 [WARNING] (MainProcess) PID:8896 <base> POST http://12.3.4:9200/case_cdqr-mydisk1/plaso_event/_bulk [status:N/A request:300.019s]
Traceback (most recent call last):
  File "site-packages\elasticsearch\connection\http_urllib3.py", line 114, in perform_request
  File "site-packages\urllib3\connectionpool.py", line 649, in urlopen
  File "site-packages\urllib3\util\retry.py", line 333, in increment
  File "site-packages\urllib3\connectionpool.py", line 600, in urlopen
  File "site-packages\urllib3\connectionpool.py", line 388, in _make_request
  File "site-packages\urllib3\connectionpool.py", line 308, in _raise_timeout
ReadTimeoutError: HTTPConnectionPool(host=u'1.2.3.4', port=9200): Read timed out. (read timeout=300)
Traceback (most recent call last):
  File "tools\psort.py", line 68, in <module>
  File "tools\psort.py", line 54, in Main
  File "plaso\cli\psort_tool.py", line 561, in ProcessStorage
  File "plaso\multi_processing\psort.py", line 1017, in ExportEvents
  File "plaso\multi_processing\psort.py", line 534, in _ExportEvents
  File "plaso\multi_processing\psort.py", line 431, in _ExportEvent
  File "plaso\multi_processing\psort.py", line 594, in _FlushExportBuffer
  File "plaso\output\interface.py", line 118, in WriteEventMACBGroup
  File "plaso\output\interface.py", line 73, in WriteEvent
  File "plaso\output\shared_elastic.py", line 292, in WriteEventBody
  File "plaso\output\shared_elastic.py", line 218, in _InsertEvent
  File "plaso\output\shared_elastic.py", line 112, in _FlushEvents
  File "site-packages\elasticsearch\client\utils.py", line 73, in _wrapped
  File "site-packages\elasticsearch\client\__init__.py", line 1174, in bulk
  File "site-packages\elasticsearch\transport.py", line 312, in perform_request
  File "site-packages\elasticsearch\connection\http_urllib3.py", line 122, in perform_request
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host=u'1.2.3.4', port=9200): Read timed out. (read timeout=300))
Failed to execute script psort
```

This is probably due to high load on the ES server. Do I now have to delete the ES index (~37M docs, ~22 GB) and start over? It would be useful to be able to manage the timeout, e.g. "retry n times on timeout", and maybe also to resume a failed export (that is probably more complicated).
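As an interim workaround, the "retry n times" behavior can be approximated outside of plaso by wrapping the export command in a shell retry loop. This is a generic sketch; the `retry` function, attempt count, and sleep interval are all made up for illustration and are not CDQR or plaso features:

```shell
# Generic retry wrapper: rerun a command up to $1 times until it succeeds,
# e.g. "retry 3 cdqr.exe ..." to restart an export that died on a timeout.
retry() {
  local max=$1; shift
  local n=1
  until "$@"; do
    if [ "$n" -ge "$max" ]; then
      echo "Giving up after $n attempts" >&2
      return 1
    fi
    echo "Attempt $n failed, retrying..." >&2
    n=$((n + 1))
    sleep 5
  done
}
```

Note that rerunning psort restarts the insert from the beginning, so this is no substitute for a real resume feature; it only saves manually relaunching the job.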

orlikoski commented 5 years ago

The error does appear to be caused by a timeout on the ES server. Tuning the ES server to have enough memory can help with that. There is a lot to optimizing ES, but a good starting point is to increase the JVM heap in jvm.options. Here is a script to change that:

```shell
# Assumes the stock 1 GB heap defaults (-Xms1g / -Xmx1g) in jvm.options
mem_size="2"
echo "Setting jvm.options memory size to $mem_size GB"
sudo sed -i "s/-Xms1/-Xms$mem_size/g" /etc/elasticsearch/jvm.options
sudo sed -i "s/-Xmx1/-Xmx$mem_size/g" /etc/elasticsearch/jvm.options
sudo systemctl restart elasticsearch
```
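Since those sed substitutions assume the stock `-Xms1g`/`-Xmx1g` defaults, a quick dry run on a scratch file shows exactly what they do before touching the real config (the scratch path and two-line contents below are stand-ins, not the actual jvm.options):

```shell
# Simulate the heap-size edit on a scratch file mimicking the stock
# defaults; the real file lives at /etc/elasticsearch/jvm.options.
mem_size="2"
printf -- '-Xms1g\n-Xmx1g\n' > /tmp/jvm.options.test
sed -i "s/-Xms1/-Xms$mem_size/g" /tmp/jvm.options.test
sed -i "s/-Xmx1/-Xmx$mem_size/g" /tmp/jvm.options.test
cat /tmp/jvm.options.test   # should now read -Xms2g / -Xmx2g
```

As a general rule of thumb, keep -Xms and -Xmx equal and well below the machine's total RAM so the OS and other processes are not starved.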

CDQR uses Plaso's psort to insert into the ES database, so I suggest filing the resume feature request on their page.

I would recommend not starting over from scratch, but rather using a new index name when re-processing, after tuning the ES server to handle the load.