log2timeline / plaso

Super timeline all the things
https://plaso.readthedocs.io
Apache License 2.0

nsrlsvr analysis plugin reaches worker memory limit #3286

Open madsumm opened 4 years ago

madsumm commented 4 years ago

Description of problem: This may not be a new problem; was it reported before? After numerous attempts with various versions of plaso and the latest nsrlsvr (1.7.0), processing stops halfway through.

I noticed that when the "nsrlsvr" worker reaches 2GB of memory usage, the "nsrlsvr" process is killed. I tried this on at least 3 different plaso files. Only plaso files that keep the nsrlsvr memory use below 2GB seem to complete.

Command line and arguments:

psort.py --analysis nsrlsvr -o null <plasofile>

Plaso version:

20201007 - installed via GIFT -> plaso-tools. Tried the GitHub version as well: same results. Tried the 0730 version: same results.

Operating system Plaso is running on:

Ubuntu 20.04.1, clean install on a VM

Installation method:

sudo add-apt-repository ppa:gift/stable
sudo apt-get update
sudo apt-get install plaso-tools

Debug output/tracebacks:

Traceback (most recent call last):
  File "/usr/bin/psort.py", line 99, in <module>
    if not Main():
  File "/usr/bin/psort.py", line 76, in Main
    tool.ProcessStorage()
  File "/usr/lib/python3/dist-packages/plaso/cli/psort_tool.py", line 572, in ProcessStorage
    analysis_engine.AnalyzeEvents(
  File "/usr/lib/python3/dist-packages/plaso/multi_processing/psort.py", line 949, in AnalyzeEvents
    self._AnalyzeEvents(
  File "/usr/lib/python3/dist-packages/plaso/multi_processing/psort.py", line 321, in _AnalyzeEvents
    event_queue.PushItem((event, event_data, event_data_stream))
  File "/usr/lib/python3/dist-packages/plaso/engine/zeromq_queue.py", line 457, in PushItem
    raise errors.QueueFull
plaso.lib.errors.QueueFull

joachimmetz commented 4 years ago

This is likely the analysis process reaching the worker process memory limit. Try adjusting --process-memory-limit and/or --worker_memory_limit.

When time permits I'll have a look at why the nsrlsvr analysis plugin is reaching this limit in the first place.

madsumm commented 4 years ago

I had set either one to "0". Same results. Now testing with BOTH set to 0.

madsumm commented 4 years ago

I tested with "0" set for both --worker-memory-limit and --process-memory-limit. Same result: when the "nsrlsvr" worker reaches 2GB, it is killed.

joachimmetz commented 4 years ago

The 2G limit looks like it is caused by <plaso_xmlrpc> Unable to make RPC call with error: <Fault 1: "<class 'OverflowError'>:int exceeds XML-RPC limits"> and the worker process is killed because the status info RPC is failing.
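The overflow quoted above can be reproduced with the Python standard library alone. The following is a minimal sketch, not plaso code: it shows that `xmlrpc.client` refuses to marshal an integer above 2**31 - 1, which is why a value such as a memory usage figure of 2 GiB or more cannot be sent as a plain XML-RPC int. Whether plaso's status RPC reports memory usage this way is an assumption here, and `can_marshal` is a hypothetical helper introduced for illustration.

```python
import xmlrpc.client

# Largest value the XML-RPC <int> type can carry (2**31 - 1).
MAX_XMLRPC_INT = xmlrpc.client.MAXINT


def can_marshal(value):
  """Returns True if xmlrpc.client can serialize the value as a parameter."""
  try:
    xmlrpc.client.dumps((value,))
    return True
  except OverflowError:
    # Raised by the marshaller for ints outside the 32-bit signed range.
    return False


below_limit = can_marshal(MAX_XMLRPC_INT)      # 2147483647 bytes, just under 2 GiB
above_limit = can_marshal(MAX_XMLRPC_INT + 1)  # 2147483648 bytes, exactly 2 GiB

# One possible workaround (an assumption, not necessarily what plaso does):
# send large counters as strings and convert them back on the receiving side.
payload = xmlrpc.client.dumps((str(2 * 1024**3),))
decoded_params, _ = xmlrpc.client.loads(payload)
restored_value = int(decoded_params[0])
```

Under this sketch a process that stays below 2 GiB never triggers the overflow, which would be consistent with the observation that only the smaller plaso files complete.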

joachimmetz commented 4 years ago

Handling plaso.lib.errors.QueueFull is part of https://github.com/log2timeline/plaso/issues/366 (moved to https://github.com/log2timeline/plaso/issues/3309) improving handling of the abort path

madsumm commented 3 years ago

Update: I was able to run the nsrlsvr analysis plugin with both the worker and process memory limit options set to 8GB-10GB (depending on the number of events in the plaso file). However, it only runs on the same VM/system the nsrlsvr application is running on.

I tried to use an NSRL server set up on another system, reached remotely via --nsrlsvr-host, and it gave errors. Port 9120 is open, as tested.

Command: psort.py -o null --analysis nsrlsvr --nsrlsvr-host <IP> --nsrlsvr-hash md5 <plasofile>

Traceback (most recent call last):
  File "/Users/user/plaso_env/bin/psort.py", line 4, in <module>
    __import__('pkg_resources').run_script('plaso==20201007', 'psort.py')
  File "/Users/userplaso_env/lib/python3.8/site-packages/pkg_resources/__init__.py", line 665, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/user/plaso_env/lib/python3.8/site-packages/pkg_resources/__init__.py", line 1463, in run_script
    exec(code, namespace, namespace)
  File "/Users/user/plaso_env/lib/python3.8/site-packages/plaso-20201007-py3.8.egg/EGG-INFO/scripts/psort.py", line 99, in <module>
    if not Main():
  File "/Users/user/plaso_env/lib/python3.8/site-packages/plaso-20201007-py3.8.egg/EGG-INFO/scripts/psort.py", line 76, in Main
    tool.ProcessStorage()
  File "/Users/user/plaso_env/lib/python3.8/site-packages/plaso-20201007-py3.8.egg/plaso/cli/psort_tool.py", line 572, in ProcessStorage
    analysis_engine.AnalyzeEvents(
  File "/Users/user/plaso_env/lib/python3.8/site-packages/plaso-20201007-py3.8.egg/plaso/multi_processing/psort.py", line 933, in AnalyzeEvents
    self._StartAnalysisProcesses(storage_writer, analysis_plugins)
  File "/Users/user/plaso_env/lib/python3.8/site-packages/plaso-20201007-py3.8.egg/plaso/multi_processing/psort.py", line 691, in _StartAnalysisProcesses
    process = self._StartWorkerProcess(analysis_plugin.NAME, storage_writer)
  File "/Users/user/plaso_env/lib/python3.8/site-packages/plaso-20201007-py3.8.egg/plaso/multi_processing/psort.py", line 873, in _StartWorkerProcess
    process.start()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
    return Popen(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
    super().__init__(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
    prep_data = spawn.get_preparation_data(process_obj._name)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/spawn.py", line 183, in get_preparation_data
    main_mod_name = getattr(main_module.__spec__, "name", None)
AttributeError: module '__main__' has no attribute '__spec__'

joachimmetz commented 3 years ago

@madsumm that is a duplicate of https://github.com/log2timeline/plaso/issues/3164 and looks like an issue with your Python installation.

Thanks for confirming that the limit is no longer an issue. Now to figure out why such a vast amount of memory is being consumed.
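The `AttributeError` at the end of the traceback above comes from the expression `getattr(main_module.__spec__, "name", None)` in `multiprocessing/spawn.py`, which assumes that `__main__` has a `__spec__` attribute. The sketch below reproduces only the shape of that failure with a stand-in module object; it is not plaso or multiprocessing code, and both helper functions are hypothetical names introduced for illustration. How the real `__main__` lost its `__spec__` (the traceback suggests the script was launched through `pkg_resources.run_script`) is left open here.

```python
import types


def main_module_name(main_module):
  """Mirrors the expression shown in the traceback (Python 3.8 spawn.py)."""
  return getattr(main_module.__spec__, "name", None)


def safe_main_module_name(main_module):
  """Hypothetical defensive variant that treats a missing __spec__ as None."""
  return getattr(getattr(main_module, "__spec__", None), "name", None)


# Stand-in for a __main__ module whose __spec__ attribute has been removed.
fake_main = types.ModuleType("__main__")
del fake_main.__spec__

try:
  main_module_name(fake_main)
  error_message = None
except AttributeError as exception:
  error_message = str(exception)

# The defensive variant does not raise for the same module object.
fallback_name = safe_main_module_name(fake_main)
```

Because the failure happens while the worker process is being started, it is independent of the nsrlsvr host setting and of the memory limits discussed earlier in this thread.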

joachimmetz commented 3 years ago

Some changes to reduce memory usage: https://github.com/log2timeline/plaso/pull/3483