richardlehane / siegfried

signature-based file format identification
http://www.itforarchivists.com/siegfried
Apache License 2.0

Memory Leak on File #171

Closed by gleporeNARA 2 years ago

gleporeNARA commented 2 years ago

When attempting to scan the attached file (unzipped), my computer rapidly uses up all 16GB of memory, plus the swap, leading to a computer crash. Can you verify?

Operating System: KDE neon 5.23
KDE Plasma Version: 5.23.3
KDE Frameworks Version: 5.88.0
Qt Version: 5.15.3
Kernel Version: 5.4.0-90-generic (64-bit)
Graphics Platform: X11
Processors: 12 × AMD Ryzen 5 2600 Six-Core Processor
Memory: 15.6 GiB of RAM
Graphics Processor: GeForce GT 1030/PCIe/SSE2

Thumbs.db.zip

richardlehane commented 2 years ago

Thanks for reporting this Greg. This seems to be an issue with github.com/richardlehane/mscfb, the library I wrote to unpack OLE files.

gleporeNARA commented 2 years ago

Here's another file exhibiting the same behavior.

quake2.suo.zip

ross-spencer commented 2 years ago

This won't be much help, but MSCFB is getting stuck in an infinite loop in setMiniStream(): findNext() looks unable to pinpoint an end-of-chain signal (a sector number?), which is the setMiniStream() exit condition, and readAt() always returns the same buffers. The list of mini stream sector numbers simply keeps growing until memory maxes out.

In Thumbs.db it's dirs 10 and 12:

2021/12/30 07:36:43 [10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 2 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 222 0 0 0 39 2 0 0 0 0 0 0 56 0 50 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 2 1 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 231 0 0 0 128 9 0 0 0 0 0 0 57 0 50 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 2 1 255 255 255 255 2 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 1 0 0 113 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
2021/12/30 07:36:43 [12 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 2 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 222 0 0 0 39 2 0 0 0 0 0 0 56 0 50 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 2 1 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 231 0 0 0 128 9 0 0 0 0 0 0 57 0 50 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 2 1 255 255 255 255 2 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 1 0 0 113 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
2021/12/30 07:36:43 [10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 2 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 222 0 0 0 39 2 0 0 0 0 0 0 56 0 50 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 2 1 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 231 0 0 0 128 9 0 0 0 0 0 0 57 0 50 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 2 1 255 255 255 255 2 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10 1 0 0 113 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]

And quake2.suo it's 10, 11, 12:

2021/12/30 07:41:07 [10 0 0 0 102 0 95 0 103 0 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14 0 2 1 19 0 0 0 18 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 0 0 0 248 5 0 0 0 0 0 0 84 0 97 0 115 0 107 0 76 0 105 0 115 0 116 0 83 0 104 0 111 0 114 0 116 0 99 0 117 0 116 0 115 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 36 0 2 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 29 0 0 0 8 0 0 0 0 0 0 0 73 0 86 0 83 0 77 0 68 0 68 0 101 0 115 0 105 0 103 0 110 0 101 0 114 0 83 0 101 0 114 0 118 0 105 0 99 0 101 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 42 0 2 1 1 0 0 0 12 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 254 255 255 255 0 0 0 0 0 0 0 0 83 0 111 0 117 0 114 0 99 0 101 0 67 0 111 0 100 0 101 0 67 0 111 0 110 0 116 0 114 0 111 0 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 36 0 2 1 14 0 0 0 22 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 169 0 0 0 44 0 0 0 0 0 0 0]
2021/12/30 07:41:07 [11 0 0 0 102 0 95 0 103 0 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14 0 2 1 19 0 0 0 18 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 0 0 0 248 5 0 0 0 0 0 0 84 0 97 0 115 0 107 0 76 0 105 0 115 0 116 0 83 0 104 0 111 0 114 0 116 0 99 0 117 0 116 0 115 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 36 0 2 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 29 0 0 0 8 0 0 0 0 0 0 0 73 0 86 0 83 0 77 0 68 0 68 0 101 0 115 0 105 0 103 0 110 0 101 0 114 0 83 0 101 0 114 0 118 0 105 0 99 0 101 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 42 0 2 1 1 0 0 0 12 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 254 255 255 255 0 0 0 0 0 0 0 0 83 0 111 0 117 0 114 0 99 0 101 0 67 0 111 0 100 0 101 0 67 0 111 0 110 0 116 0 114 0 111 0 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 36 0 2 1 14 0 0 0 22 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 169 0 0 0 44 0 0 0 0 0 0 0]
2021/12/30 07:41:07 [12 0 0 0 102 0 95 0 103 0 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14 0 2 1 19 0 0 0 18 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 0 0 0 248 5 0 0 0 0 0 0 84 0 97 0 115 0 107 0 76 0 105 0 115 0 116 0 83 0 104 0 111 0 114 0 116 0 99 0 117 0 116 0 115 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 36 0 2 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 29 0 0 0 8 0 0 0 0 0 0 0 73 0 86 0 83 0 77 0 68 0 68 0 101 0 115 0 105 0 103 0 110 0 101 0 114 0 83 0 101 0 114 0 118 0 105 0 99 0 101 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 42 0 2 1 1 0 0 0 12 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 254 255 255 255 0 0 0 0 0 0 0 0 83 0 111 0 117 0 114 0 99 0 101 0 67 0 111 0 100 0 101 0 67 0 111 0 110 0 116 0 114 0 111 0 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 36 0 2 1 14 0 0 0 22 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 169 0 0 0 44 0 0 0 0 0 0 0]
2021/12/30 07:41:07 [10 0 0 0 102 0 95 0 103 0 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 14 0 2 1 19 0 0 0 18 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 101 0 0 0 248 5 0 0 0 0 0 0 84 0 97 0 115 0 107 0 76 0 105 0 115 0 116 0 83 0 104 0 111 0 114 0 116 0 99 0 117 0 116 0 115 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 36 0 2 0 255 255 255 255 255 255 255 255 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 29 0 0 0 8 0 0 0 0 0 0 0 73 0 86 0 83 0 77 0 68 0 68 0 101 0 115 0 105 0 103 0 110 0 101 0 114 0 83 0 101 0 114 0 118 0 105 0 99 0 101 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 42 0 2 1 1 0 0 0 12 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 254 255 255 255 0 0 0 0 0 0 0 0 83 0 111 0 117 0 114 0 99 0 101 0 67 0 111 0 100 0 101 0 67 0 111 0 110 0 116 0 114 0 111 0 108 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 36 0 2 1 14 0 0 0 22 0 0 0 255 255 255 255 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 169 0 0 0 44 0 0 0 0 0 0 0]

After that, I've been wrestling with the Microsoft MSCFB docs (and a few other reverse-engineering docs) to find out whether anything is missing in how the data should be processed, and which of the data we have collected helps process it. According to the docs, is it accurate that at this point in the stream the data is a "Compound File User-Defined Data Sector" link, and also a directory entry? At 512 bytes, the data above can be read into those structures okay, and it all looks consistent, but it doesn't really reveal anything and I'm not sure it is correct.
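For reference, the [MS-CFB] specification describes a directory entry as a fixed 128-byte little-endian record, so a 512-byte sector holds four of them. The following is a minimal sketch of decoding one entry under that layout; the `dirEntry` type and `parseDirEntry` function are hypothetical names for illustration and are not part of the mscfb package's API.

```go
package main

import (
	"encoding/binary"
	"errors"
	"unicode/utf16"
)

// dirEntrySize is the fixed size of a directory entry in [MS-CFB].
const dirEntrySize = 128

// dirEntry holds a subset of the fields of one directory entry.
// (Hypothetical type for illustration only.)
type dirEntry struct {
	Name         string
	ObjectType   uint8 // 0 unallocated, 1 storage, 2 stream, 5 root storage
	LeftSibling  uint32
	RightSibling uint32
	Child        uint32
	StartSector  uint32
	Size         uint64
}

// parseDirEntry decodes the first 128 bytes of b as a directory entry.
func parseDirEntry(b []byte) (dirEntry, error) {
	if len(b) < dirEntrySize {
		return dirEntry{}, errors.New("short directory entry")
	}
	// Name length is in bytes and includes the terminating null.
	nameLen := int(binary.LittleEndian.Uint16(b[64:66]))
	if nameLen > 64 || nameLen%2 != 0 {
		return dirEntry{}, errors.New("invalid name length")
	}
	var name string
	if nameLen >= 2 {
		units := make([]uint16, (nameLen-2)/2)
		for i := range units {
			units[i] = binary.LittleEndian.Uint16(b[2*i:])
		}
		name = string(utf16.Decode(units))
	}
	return dirEntry{
		Name:         name,
		ObjectType:   b[66],
		LeftSibling:  binary.LittleEndian.Uint32(b[68:72]),
		RightSibling: binary.LittleEndian.Uint32(b[72:76]),
		Child:        binary.LittleEndian.Uint32(b[76:80]),
		StartSector:  binary.LittleEndian.Uint32(b[116:120]),
		Size:         binary.LittleEndian.Uint64(b[120:128]),
	}, nil
}
```

Read this way, the second 128-byte entry in the Thumbs.db dump above (name bytes 56 0 50 0) should decode to a stream named "82" with start sector 231 and size 2432. Streams smaller than the 4096-byte cutoff defined in the specification are stored in the mini stream.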

I've tried looking at how https://github.com/decalage2/oletools (and its parent library olefile) pulls the OLE2 apart, but as I can't pinpoint the right terminology, I'm not sure whether it looks specifically at these mini streams anywhere in its code.

gleporeNARA commented 2 years ago

Thanks for taking a look at this. Those are the only two examples out of millions of files I've tested, so it's definitely a corner case! What about limiting the total amount of memory siegfried can use on examining one file? Perhaps as a multiple of file size or something. These are both very small files. Either way, it's not a show stopper now that I'm aware it can happen. Supposedly my Linux box should kill runaway processes before they take down the machine, but that doesn't seem to be happening.
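One way to tie the limit to file size, as suggested above, is to cap the length of any sector chain: a well-formed chain visits each sector at most once, so it cannot be longer than the number of sectors that fit in the file. The sketch below assumes a hypothetical `next` callback that returns the following sector number for a given sector; `walkChain` and `maxSectorsFor` are illustrative names, not functions from siegfried or mscfb.

```go
package main

import "errors"

// endOfChain is the ENDOFCHAIN marker defined by [MS-CFB].
const endOfChain = 0xFFFFFFFE

var errChainTooLong = errors.New("sector chain longer than file allows")

// maxSectorsFor returns how many sectors of sectorSize bytes can fit
// in a file of fileSize bytes. (Hypothetical helper.)
func maxSectorsFor(fileSize, sectorSize int64) int {
	if sectorSize <= 0 {
		return 0
	}
	return int(fileSize / sectorSize)
}

// walkChain follows next-sector links from start until the end-of-chain
// marker, refusing to collect more than maxSectors entries. The next
// callback is an assumed lookup into the (mini) FAT.
func walkChain(start uint32, maxSectors int, next func(uint32) (uint32, error)) ([]uint32, error) {
	chain := make([]uint32, 0, 16)
	for s := start; s != endOfChain; {
		if len(chain) >= maxSectors {
			return nil, errChainTooLong
		}
		chain = append(chain, s)
		n, err := next(s)
		if err != nil {
			return nil, err
		}
		s = n
	}
	return chain, nil
}
```

For 64-byte mini sectors and a 10 KiB file, `maxSectorsFor(10240, 64)` gives 160, so a looping chain in such a file would be rejected after at most 160 entries instead of growing until memory runs out.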

richardlehane commented 2 years ago

I took a look at this over the weekend. It's very much as Ross said: the chain traversal code for the mini stream is looping. We could blame the files Greg shared, as they seem to have repeating (i.e. bad) values in their sector chains. E.g. the value 0A 00 00 00 at offset 560 in the Thumbs.db file repeats a value already in that chain, causing a loop in that file. But the real culprit is my code, as I should be guarding against cycles and returning an error, not entering an infinite loop :(
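A guard of this kind can be sketched by remembering every sector number already visited and returning an error when one repeats. This is only an illustration under the assumption that the allocation table has been loaded into a slice; `followChain` is a hypothetical name and is not the actual fix made in mscfb.

```go
package main

import "fmt"

// endOfChain is the ENDOFCHAIN marker defined by [MS-CFB].
const endOfChain uint32 = 0xFFFFFFFE

// followChain walks an allocation table (table[s] is the sector that
// follows s) from start, and returns an error instead of looping when
// a sector number repeats or points outside the table.
func followChain(table []uint32, start uint32) ([]uint32, error) {
	seen := make(map[uint32]struct{})
	var chain []uint32
	for s := start; s != endOfChain; s = table[s] {
		if uint64(s) >= uint64(len(table)) {
			return nil, fmt.Errorf("sector %d out of range", s)
		}
		if _, dup := seen[s]; dup {
			return nil, fmt.Errorf("cycle detected at sector %d", s)
		}
		seen[s] = struct{}{}
		chain = append(chain, s)
	}
	return chain, nil
}
```

The visited set costs memory proportional to the chain length, which is bounded by the table size, so a malformed file can no longer make the traversal grow without limit.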

I've raised an issue in the mscfb repository: https://github.com/richardlehane/mscfb/issues/12. A fix will follow soon and I'll include it in the next sf release.

richardlehane commented 2 years ago

I made updates to the MSCFB package to detect cycles in the ministream directory in order to avoid panics for files malformed in this way. This should prevent siegfried from failing on these files. However, because of this error in the ministream, they won't be parsed correctly, which may prevent identification.

The fix is in the latest release of siegfried: https://github.com/richardlehane/siegfried/releases/tag/v1.9.2. I'll update the debian repository tonight.

Thanks again for reporting this Greg.