simsong / bulk_extractor

This is the development tree. Production downloads are at:
https://github.com/simsong/bulk_extractor/releases

libewf crashes when out-of-memory #331

Closed (simsong closed 2 years ago)

simsong commented 2 years ago

I have replicated this several times now. No crash with .raw files, nor with small .E01 files, but definitely with large ones.

First crash:

10:43:45 Offset 642366MB (34.46%) Done in  4:21:37 at 2022-01-23 15:05:22
10:43:46 Offset 642366MB (34.46%) Done in  4:21:37 at 2022-01-23 15:05:23
10:43:47 Offset 642366MB (34.47%) Done in  4:21:37 at 2022-01-23 15:05:24
10:43:48 Offset 642366MB (34.47%) Done in  4:21:34 at 2022-01-23 15:05:22
10:43:49 Offset 642366MB (34.49%) Done in  4:21:25 at 2022-01-23 15:05:14
zsh: killed     ./src/bulk_extractor -1 -Z -o /Volumes/out/arm-be20-2TB-v1

Second crash, this one under a debugger:

13:44:33 Offset 546970MB (29.76%) Done in  4:17:52 at 2022-01-23 18:02:25
13:44:34 Offset 546970MB (29.76%) Done in  4:17:50 at 2022-01-23 18:02:24
13:44:35 Offset 546970MB (29.77%) Done in  4:17:45 at 2022-01-23 18:02:20
13:44:36 Offset 546970MB (29.78%) Done in  4:17:41 at 2022-01-23 18:02:17
Process 29395 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGKILL
    frame #0: 0x00000001a0bf815c libsystem_platform.dylib`_platform_memmove + 76
libsystem_platform.dylib`_platform_memmove:
->  0x1a0bf815c <+76>: stnp   q2, q3, [x0]
    0x1a0bf8160 <+80>: subs   x2, x2, #0x40             ; =0x40
    0x1a0bf8164 <+84>: b.ls   0x1a0bf8180               ; <+112>
    0x1a0bf8168 <+88>: stnp   q0, q1, [x3]
Target 0: (bulk_extractor) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGKILL
  * frame #0: 0x00000001a0bf815c libsystem_platform.dylib`_platform_memmove + 76
    frame #1: 0x00000001008cb5e4 libewf.2.dylib`libewf_handle_read_buffer + 628
    frame #2: 0x00000001008cb6cc libewf.2.dylib`libewf_handle_read_random + 80
    frame #3: 0x000000010006cc1c bulk_extractor`process_ewf::pread(this=0x0000600002610000, buf=<unavailable>, bytes=<unavailable>, offset=<unavailable>) const at image_process.cpp:276:15 [opt]
    frame #4: 0x000000010006cdac bulk_extractor`process_ewf::sbuf_alloc(this=0x0000600002610000, it=0x000000016fdfd428) const at image_process.cpp:337:28 [opt]
    frame #5: 0x0000000100072ac8 bulk_extractor`Phase1::get_sbuf(this=0x000000016fdfd680, it=0x000000016fdfd428) at phase1.cpp:84:22 [opt]
    frame #6: 0x0000000100073628 bulk_extractor`Phase1::read_process_sbufs(this=0x000000016fdfd680) at phase1.cpp:191:37 [opt]
    frame #7: 0x0000000100074de8 bulk_extractor`Phase1::phase1_run(this=0x000000016fdfd680) at phase1.cpp:298:5 [opt]
    frame #8: 0x0000000100052568 bulk_extractor`bulk_extractor_main(cout=<unavailable>, cerr=<unavailable>, argc=<unavailable>, argv=<unavailable>) at bulk_extractor.cpp:595:16 [opt]
    frame #9: 0x00000001005bd0f4 dyld`start + 520
(lldb)

Since the crashes happen at different points in the run, and since the runs are non-deterministic due to multi-threading, I suspect that these are memory alignment/overrun/allocation errors that show up on the ARM architecture but not on the Intel architecture. (I saw a similar issue with BE16; see #328.)

simsong@Seasons bulk_extractor % ewfinfo -v
ewfinfo 20140812

Presumably @joachimmetz will want to know about this.

joachimmetz commented 2 years ago

Is access to libewf protected by a lock, or is it used from multiple threads?

simsong commented 2 years ago

libewf is only accessed from the main thread.

simsong commented 2 years ago

Recall that this is not a problem on Intel with the same disk image and program, only a problem on Apple silicon.

simsong commented 2 years ago

Here is another possible datapoint. Perhaps the system ran out of memory?

How does libewf deal with low memory conditions? bulk_extractor detects that memory allocation failed and handles it appropriately.

simsong@Seasons bulk_extractor % df -h .
Filesystem     Size   Used  Avail Capacity iused      ifree %iused  Mounted on
/dev/disk3s5  926Gi  628Gi  183Gi    78% 1765601 1918589840    0%   /System/Volumes/Data
simsong@Seasons bulk_extractor %
simsong@Seasons bulk_extractor % ps uwww 29395
USER      PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
simsong 29395   0.0  0.0 567558496    608 s000  TX   11:55AM 851:44.73 /Users/simsong/gits/bulk_extractor/src/bulk_extractor -1 -Z -o /Volumes/out/arm-be20-2TB-v2 /Users/simsong/corp/nps-2011-2tb/nps-2011-2tb.E01
simsong@Seasons bulk_extractor %
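
For reference, handling a failed allocation at the application level might look like the following. This is a minimal sketch, not bulk_extractor's actual code; `try_alloc_buffer` is a hypothetical name. A check like this only helps when the allocator actually reports failure. On systems that overcommit memory, the allocation can succeed and the process can still be killed later, which would be consistent with the SIGKILL above.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical helper (not bulk_extractor's real code): try to allocate a
// read buffer and report failure to the caller instead of throwing.
std::unique_ptr<unsigned char[]> try_alloc_buffer(std::size_t bytes) {
    // new (std::nothrow) yields nullptr on failure rather than throwing
    // std::bad_alloc, so the caller can back off or shrink the request.
    return std::unique_ptr<unsigned char[]>(
        new (std::nothrow) unsigned char[bytes]);
}
```

A caller would test the returned pointer and, if it is null, skip or retry the page with a smaller size rather than continue with an invalid buffer.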

joachimmetz commented 2 years ago

> How does libewf deal with low memory conditions?

Similarly: if an alloc or realloc fails, it "should" be handled gracefully, but an error is easily made. The trace indicates an issue in memmove (which I assume is an optimization of the memcpy in the libewf_handle_read_buffer function). If time permits I'll have a look at it tomorrow with a fresh pair of eyes.
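
To make "handled gracefully" concrete, here is a minimal sketch of the usual pattern: check the `realloc` result before overwriting the old pointer, and only then copy. This is not libewf's code; `GrowBuf` and `append_bytes` are hypothetical names used purely for illustration.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <limits>

// Hypothetical growable byte buffer (illustration only, not libewf code).
struct GrowBuf {
    unsigned char* data = nullptr;
    std::size_t size = 0;  // bytes in use
    std::size_t cap = 0;   // bytes allocated
};

// Append n bytes from src. Returns false and leaves the buffer unchanged
// if the size would overflow or realloc fails.
bool append_bytes(GrowBuf& b, const void* src, std::size_t n) {
    const std::size_t max = std::numeric_limits<std::size_t>::max();
    if (n == 0) return true;
    if (n > max - b.size) return false;  // b.size + n would overflow
    const std::size_t needed = b.size + n;
    if (needed > b.cap) {
        std::size_t new_cap = b.cap ? b.cap : 64;
        while (new_cap < needed) {
            if (new_cap > max / 2) { new_cap = needed; break; }
            new_cap *= 2;
        }
        // Keep the old pointer until realloc is known to have succeeded;
        // on failure the original block is still valid and still owned.
        void* p = std::realloc(b.data, new_cap);
        if (p == nullptr) return false;
        b.data = static_cast<unsigned char*>(p);
        b.cap = new_cap;
    }
    // The destination is now known to have room for n more bytes.
    std::memcpy(b.data + b.size, src, n);
    b.size += n;
    return true;
}
```

The common mistake this avoids is assigning the `realloc` result straight back to `b.data`, which leaks the old block and leaves a null pointer for the following copy.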

simsong commented 2 years ago

Definitely out of memory.

[Image: Screen Shot 2022-01-23 at 4 05 05 PM]

simsong commented 2 years ago

I'm closing this. BE shouldn't let memory get out of control, and this is not a problem specific to libewf. Sorry for the false alarm, @joachimmetz.
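
One way to keep memory from getting out of control is to cap the bytes held by in-flight buffers, so the reader blocks instead of allocating more when the workers fall behind. The following is a minimal sketch under that assumption; `MemoryBudget` and its methods are hypothetical names, not existing bulk_extractor code.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical helper (not part of bulk_extractor): caps the total number
// of bytes held by in-flight buffers.
class MemoryBudget {
public:
    explicit MemoryBudget(std::size_t limit) : limit_(limit) {}

    // Block until `bytes` can be reserved, then reserve them.
    void acquire(std::size_t bytes) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [&] { return fits(bytes); });
        used_ += bytes;
    }

    // Non-blocking variant: reserve and return true only if it fits now.
    bool try_acquire(std::size_t bytes) {
        std::lock_guard<std::mutex> lock(m_);
        if (!fits(bytes)) return false;
        used_ += bytes;
        return true;
    }

    // Give back previously reserved bytes and wake any waiting callers.
    void release(std::size_t bytes) {
        {
            std::lock_guard<std::mutex> lock(m_);
            used_ -= bytes;
        }
        cv_.notify_all();
    }

    std::size_t in_use() const {
        std::lock_guard<std::mutex> lock(m_);
        return used_;
    }

private:
    // Caller must hold m_. A request is always admitted when nothing is
    // outstanding, so a single oversized request cannot deadlock.
    bool fits(std::size_t bytes) const {
        if (used_ == 0) return true;
        return used_ <= limit_ && bytes <= limit_ - used_;
    }

    const std::size_t limit_;
    std::size_t used_ = 0;
    mutable std::mutex m_;
    std::condition_variable cv_;
};
```

In this sketch the reading thread would call `acquire(n)` before allocating an `n`-byte buffer, and whichever thread finishes with the buffer would call `release(n)` after freeing it.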