google / timesketch

Collaborative forensic timeline analysis
Apache License 2.0

MemoryError during timeline import #3111

Open mister-turtle opened 2 weeks ago

mister-turtle commented 2 weeks ago

Describe the bug

Currently, with the version installed by "deploy_timesketch.sh", I cannot upload via the WebUI. When I use the "timesketch-cli-client" or the "timesketch_importer" instead, the UI shows the following error, which reports a MemoryError. I have 24GB RAM assigned to this VM, there is no OOM kill in dmesg, and watching memory consumption shows a reasonable amount still free:

Original filename: XXXXX
File on disk: /usr/share/timesketch/upload/6652b05d4ee3431fae29702c674298bd
    File size: 944.85MB
    Uploaded by: tester
    Provider: Timesketch CLI client
    Context: /home/tester/client/XXXXX/XXXXX/XXXXX.plaso
    Data label: plaso
    Status: fail
    Total File Events: 1.2M
    Error message:

2024-06-13 12:16:02,337 [INFO] (MainProcess) PID:20 <tool_options> Custom formatter definitions path: /etc/timesketch/plaso_formatters.yaml
2024-06-13 12:16:02,339 [INFO] (MainProcess) PID:20 <opensearch_ts> Timeline identifier: 3
Traceback (most recent call last):
  File "/usr/bin/psort.py", line 33, in <module>
    sys.exit(load_entry_point('plaso==20240308', 'console_scripts', 'psort')())
  File "/usr/lib/python3/dist-packages/plaso/scripts/psort.py", line 78, in Main
    tool.ProcessStorage()
  File "/usr/lib/python3/dist-packages/plaso/cli/psort_tool.py", line 509, in ProcessStorage
    output_engine.ExportEvents(
  File "/usr/lib/python3/dist-packages/plaso/multi_process/output_engine.py", line 491, in ExportEvents
    self._ExportEvents(
  File "/usr/lib/python3/dist-packages/plaso/multi_process/output_engine.py", line 298, in _ExportEvents
    self._ExportEvent(
  File "/usr/lib/python3/dist-packages/plaso/multi_process/output_engine.py", line 193, in _ExportEvent
    self._FlushExportBuffer(
  File "/usr/lib/python3/dist-packages/plaso/multi_process/output_engine.py", line 373, in _FlushExportBuffer
    output_module.WriteFieldValuesOfMACBGroup(
  File "/usr/lib/python3/dist-packages/plaso/output/interface.py", line 116, in WriteFieldValuesOfMACBGroup
    self.WriteFieldValues(
  File "/usr/lib/python3/dist-packages/plaso/output/interface.py", line 103, in WriteFieldValues
    self._WriteFieldValues(output_mediator, field_values)
  File "/usr/lib/python3/dist-packages/plaso/output/opensearch_ts.py", line 48, in _WriteFieldValues
    self._FlushEvents()
  File "/usr/lib/python3/dist-packages/plaso/output/shared_opensearch.py", line 289, in _FlushEvents
    self._client.bulk(**bulk_arguments)
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/client/utils.py", line 179, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/client/__init__.py", line 411, in bulk
    return self.transport.perform_request(
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/transport.py", line 370, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/connection/http_urllib3.py", line 263, in perform_request
    self.log_request_fail(
  File "/usr/local/lib/python3.10/dist-packages/opensearchpy/connection/base.py", line 272, in log_request_fail
    body = body.decode("utf-8", "ignore")
MemoryError

Timesketch Plaso Version:

$ docker exec -it timesketch-worker psort.py --version
plaso - psort version 20240308

Plasofile created with:

$ docker run --rm -it log2timeline/plaso log2timeline.py --version
plaso - log2timeline version 20240308

Timesketch-cli-client

$ timesketch --version
Timesketch CLI, version 20230721

To Reproduce

Steps to reproduce the behavior:

  1. sudo deploy_timesketch.sh
  2. Create user
  3. python -m venv venv
  4. . ./venv/bin/activate
  5. python -m pip install timesketch-import-client
  6. timesketch_importer

Expected behavior

The Plaso file should import into Timesketch without issue.

mister-turtle commented 2 weeks ago

Since this was asked on a similar bug report, here is pinfo running successfully against a failing Plaso file in the timesketch-worker container (NB: same Plaso file, different upload attempt after removing the containers and pulling the latest versions):

root@86ddf9c455ef:/usr/share/timesketch/upload# pinfo.py --output-format json --sections events ./1be4edafed434d67a98555b43f82eed2 | jq . -
{
  "storage_counters": {
    "parsers": {
      "filestat": 155033,
      "total": 1215009,
      "pe": 1096,
      "setupapi": 1752,
      "winevtx": 353190,
      "prefetch": 2230,
      "winpca_dic": 48,
      "olecf_document_summary": 4,
      "olecf_summary": 13,
      "olecf_default": 65,
      "winreg_default": 493259,
      "windows_sam_users": 10,
      "windows_typed_urls": 4,
      "msie_zone": 48,
      "windows_run": 9,
      "shell_items": 39861,
      "lnk": 30139,
      "amcache": 6249,
      "utmp": 465,
      "recycle_bin": 2,
      "oxml": 27,
      "onedrive_log": 4728,
      "mrulistex_string": 44,
      "mrulistex_string_and_shell_item_list": 1,
      "mrulistex_shell_item_list": 11,
      "mrulist_string": 38,
      "explorer_mountpoints2": 4,
      "mrulistex_string_and_shell_item": 49,
      "explorer_programscache": 2,
      "userassist": 129,
      "bagmru": 1,
      "windows_boot_execute": 2,
      "appcompatcache": 1024,
      "windows_timezone": 1,
      "windows_shutdown": 2,
      "windows_usb_devices": 12,
      "windows_services": 758,
      "bam": 52,
      "windows_version": 3,
      "networks": 8,
      "windows_task_cache": 797,
      "winlogon": 4,
      "olecf_automatic_destinations": 1278,
      "chrome_66_cookies": 5781,
      "chrome_autofill": 26692,
      "chrome_27_history": 72057,
      "chrome_cache": 18014,
      "google_analytics_utma": 5,
      "google_analytics_utmz": 4,
      "google_analytics_utmb": 3,
      "google_analytics_utmt": 1
    },
    "event_labels": {}
  }
}
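To see at a glance which parsers account for most of the events, the pinfo JSON can be ranked by count. This is a minimal sketch under stated assumptions: `top_parsers` is a hypothetical helper, and it only relies on the `storage_counters` / `parsers` layout visible in the output above. The sample is trimmed to a few entries from that output.

```python
import json


def top_parsers(pinfo_json, limit=5):
    """Return the `limit` largest (parser, count) pairs, skipping "total"."""
    counters = json.loads(pinfo_json)["storage_counters"]["parsers"]
    per_parser = {
        name: count for name, count in counters.items() if name != "total"
    }
    ranked = sorted(per_parser.items(), key=lambda item: item[1], reverse=True)
    return ranked[:limit]


if __name__ == "__main__":
    # Trimmed sample copied from the pinfo output in this comment.
    sample = """
    {"storage_counters": {"parsers": {
        "filestat": 155033, "total": 1215009, "winevtx": 353190,
        "winreg_default": 493259, "chrome_27_history": 72057},
      "event_labels": {}}}
    """
    total = json.loads(sample)["storage_counters"]["parsers"]["total"]
    for name, count in top_parsers(sample, limit=3):
        print(f"{name}: {count} ({count / total:.1%} of {total} events)")
```

In practice the JSON could be piped in from `pinfo.py --output-format json --sections events` instead of the inline sample.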

jkppr commented 2 weeks ago

Hi @mister-turtle can you please provide the following additional information:

mister-turtle commented 2 weeks ago

Hi @jkppr ,

tsctl info

$ docker exec -it timesketch-web tsctl info
Timesketch version: 20240508.1
plaso - psort version 20240308

Node not installed. Node is only used in the dev environment.
npm not installed. npm is only used in the dev environment.
yarn not installed. Yarn is only used in the dev environment.
Python version: Python 3.10.12

pip version: pip 22.0.2 from /usr/lib/python3/dist-packages/pip (python 3.10)

Plasofile created with:

$ docker run --rm -it log2timeline/plaso log2timeline.py --version
plaso - log2timeline version 20240308

This is being run on Ubuntu 22.04

$ cat /etc/*release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS"

As for the error in the log, I'm not sure anything ever gets to the backend, as the log only has:

[2024-06-13 14:38:57 +0000] [10] [INFO] Booting worker with pid: 10
[2024-06-13 14:38:57 +0000] [11] [INFO] Booting worker with pid: 11
[2024-06-13 14:42:42,432] timesketch.analyzers.misp/ERROR MISP conf not found
[2024-06-13 14:42:42,440] timesketch.analyzers.hashlookup/ERROR Hashlookup conf not found

The Dev Tools in Chromium outputs:

/api/v1/upload/:1 Failed to load resource: net::ERR_ACCESS_DENIED

jkppr commented 2 weeks ago

So I tried to reproduce your issue following the steps outlined in your first comment, but it works for me with no issues on a fresh Ubuntu 22.04 VM and a Plaso file of a Windows disk image (therefore changing this from Bug to Support).

The versions you have provided look good as well.
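The version check can also be done mechanically. This is a minimal sketch, assuming the `--version` output keeps the "plaso - <tool> version YYYYMMDD" shape shown earlier in this thread; `plaso_version` and `versions_match` are hypothetical helper names, not part of Plaso or Timesketch.

```python
import re

# Matches the eight-digit release date in lines such as
# "plaso - psort version 20240308".
VERSION_RE = re.compile(r"version (\d{8})")


def plaso_version(line):
    """Extract the release date from a Plaso --version line, or None."""
    match = VERSION_RE.search(line)
    return match.group(1) if match else None


def versions_match(*lines):
    """True if every line parses and all report the same release."""
    versions = {plaso_version(line) for line in lines}
    return None not in versions and len(versions) == 1


if __name__ == "__main__":
    worker = "plaso - psort version 20240308"
    creator = "plaso - log2timeline version 20240308"
    print("versions match:", versions_match(worker, creator))
```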

Some additional pointers that may help: