log2timeline / plaso

Super timeline all the things
https://plaso.readthedocs.io
Apache License 2.0

Processing doesn't finish with large E01 - version 20240826 #4924

Open benjamindonnachie opened 3 days ago

benjamindonnachie commented 3 days ago

I am trying to run plaso across a large Windows image (740GB of E01s), but it fails to complete, with worker processes shown as killed or idle for an extended period:

plaso - log2timeline version 20240826

Source path             : /data/xxxx.E01
Source type             : storage media image
Processing time         : 7 days, 20:48:52

Tasks:          Queued  Processing      Merging         Abandoned       Total
                10002   5               0               23              666632

Identifier      PID     Status          Memory          Sources         Event Data      File
Main            7       running         2.0 GiB         800602 (0)      7758570 (0)
Worker_00       11      killed          2.0 GiB         206869 (0)      315434 (0)
Worker_01       13      killed          2.0 GiB         175802 (0)      630803 (0)
Worker_02       15      killed          2.0 GiB         127922 (0)      671083 (0)
Worker_03       17      killed          2.0 GiB         77952 (0)       5667842 (0)
Worker_04       19      killed          2.0 GiB         211996 (0)      431259 (0)
Worker_05       32      killed          2.0 GiB         58 (0)          1439 (1)
Worker_06       36      killed          2.0 GiB         0 (0)           3587 (5)
Worker_07       40      idle            1.4 GiB         0 (0)           0 (0)
Worker_08       44      idle            1.4 GiB         0 (0)           0 (0)
Worker_09       48      idle            1.4 GiB         0 (0)           0 (0)
Worker_10       52      killed          2.0 GiB         0 (0)           4774 (8)
Worker_11       56      killed          2.0 GiB         0 (0)           2677 (6)
Worker_12       60      idle            1.4 GiB         0 (0)           0 (0)
Worker_13       64      killed          2.0 GiB         0 (0)           2865 (11)
Worker_14       68      killed          2.0 GiB         0 (0)           2917 (11)
Worker_15       72      killed          2.0 GiB         0 (0)           1636 (11)
Worker_16       76      killed          2.0 GiB         0 (0)           2125 (9)
Worker_17       80      killed          2.0 GiB         0 (0)           2760 (11)
Worker_18       84      killed          2.0 GiB         0 (0)           2282 (5)
Worker_19       88      killed          2.0 GiB         0 (0)           2563 (2)
Worker_20       92      killed          2.0 GiB         0 (0)           1934 (11)
Worker_21       96      killed          2.0 GiB         0 (0)           3456 (6)
Worker_22       100     killed          2.0 GiB         0 (0)           1703 (12)
Worker_23       104     killed          2.0 GiB         0 (0)           960 (8)
Worker_24       108     killed          2.0 GiB         0 (0)           1812 (12)
Worker_25       112     killed          2.0 GiB         0 (0)           1344 (5)
Worker_26       116     killed          2.0 GiB         0 (0)           1315 (9)
Worker_27       120     idle            1.6 GiB         0 (0)           0 (0)
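For readers who want to summarise a status dump like the one above, here is a minimal sketch that tallies the Status column. It assumes the whitespace-separated layout exactly as pasted in this issue. The function name `tally_worker_status` is hypothetical and not part of plaso. Applied to the full table above, this kind of tally gives 23 killed and 5 idle workers.

```python
from collections import Counter


def tally_worker_status(status_text):
    """Count workers per status in pasted log2timeline status output."""
    counts = Counter()
    for line in status_text.splitlines():
        fields = line.split()
        # Worker rows look like: Worker_00  11  killed  2.0 GiB ...
        # (identifier, PID, status); other rows are skipped.
        if len(fields) >= 3 and fields[0].startswith("Worker_"):
            counts[fields[2]] += 1
    return counts


# Three rows copied from the status output in this issue.
sample = """\
Worker_05       32      killed          2.0 GiB         58 (0)          1439 (1)
Worker_06       36      killed          2.0 GiB         0 (0)           3587 (5)
Worker_07       40      idle            1.4 GiB         0 (0)           0 (0)
"""

print(tally_worker_status(sample))
```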

I've tried multiple ways of running it: under Ubuntu 24.04 using the provided packages (python3-plaso), and under macOS 15.1 using both brew and Docker (v4.35.0 - engine v27.3.1). All were running plaso version 20240826, and all failed to complete.

The docker command is:

docker run -v "$(pwd)":/data/ log2timeline/plaso:latest log2timeline --vss_stores none --partitions all --volumes all --hashers md5,sha1,sha256 --parsers win7_slow,winxp_slow,text --yara-rules /data/yara-rules-full_20241020.yar --storage_file /data/xxxx.E01_202410_full_yara.plaso /data/xxxx.E01

The machine has 64GB of RAM. I've raised the file descriptor limit to the maximum, and Docker has 64GB RAM and a 1TB virtual disk available.
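A limit raised in the host shell does not necessarily apply inside a container, so it may be worth confirming the file descriptor limit from the environment where plaso actually runs. A minimal sketch, assuming a Unix-like system with Python's standard `resource` module available (the helper name `open_file_limits` is hypothetical):

```python
import resource  # Unix-only standard library module


def open_file_limits():
    """Return the (soft, hard) open file descriptor limits for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
# A value equal to resource.RLIM_INFINITY means "unlimited".
print(f"soft={soft} hard={hard}")
```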

If I run pinfo on the resulting storage file, it reports 'sqlite3.DatabaseError: database disk image is malformed'.
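Since the error comes from Python's sqlite3 module, the storage file can also be checked independently of plaso. The sketch below runs SQLite's `PRAGMA integrity_check` against a file opened read-only. It is a diagnostic aid only, not something plaso provides, and the function name `check_sqlite_integrity` is hypothetical.

```python
import pathlib
import sqlite3


def check_sqlite_integrity(path):
    """Return the rows reported by PRAGMA integrity_check, or the error text."""
    # Build a file: URI and open read-only so the check cannot modify the file.
    uri = pathlib.Path(path).resolve().as_uri() + "?mode=ro"
    try:
        connection = sqlite3.connect(uri, uri=True)
    except sqlite3.Error as exception:
        return [f"open failed: {exception}"]
    try:
        rows = connection.execute("PRAGMA integrity_check").fetchall()
        # A healthy database yields a single row containing "ok".
        return [row[0] for row in rows]
    except sqlite3.DatabaseError as exception:
        return [f"database error: {exception}"]
    finally:
        connection.close()
```

For example, `check_sqlite_integrity("/data/xxxx.E01_202410_full_yara.plaso")` would return `["ok"]` for an intact file. Anything else suggests the storage file was damaged, for instance by an interrupted write.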

Unfortunately, I cannot share the image as it contains PII.

I will rerun with the debug option shortly after a reboot.

benjamindonnachie commented 1 day ago

Unexpectedly, it completed fine in just over 12 hours using:

docker run -v "$(pwd)":/data/ log2timeline/plaso:latest log2timeline --debug --single-process --vss_stores none --partitions all --volumes all --hashers md5,sha1,sha256 --parsers win7_slow,winxp_slow,text --yara-rules /data/yara-rules-full_20241020.yar --storage_file /data/xxxx.E01_20241124_full_yara.plaso /data/xxxx.E01