Implement support for compressed long values

vicodark commented 3 years ago

I've tried to run eseparser to dump the SystemIndex_PropertyStore from a few Windows.edb files. Every time, most of the string data comes out as Chinese-looking characters or similar, as seen here:

{"WorkID":40,"27F-System_Search_Rank":707406378,"14F-System_FileAttributes":707406378,"4440-System_ItemFolderPathDisplay":"尕绮諚檵夭ᖬ둮처錶炵岵圌엃굠ౙ鬖拍�ᖭ솃絢献淊잦淼뮦ﲱ缽ﲹ�碼왆얉㋷๷쬏懩࠷麝�㳲","

I tried digging around in the code, and it looks like the taggedItems buffers returned by ParseTaggedValues for these Long Text columns do not hold the string data at all. A random selection of the data stored there, hex encoded, looks like this: 10fb692bd6aab564b156ac96bbd16232db4cd6c2d572315c0d1783b56631586c368bd966b7560c068bf501

Any idea what's going on here?

vicodark commented 3 years ago

Well, you can just go ahead and ignore the above and have a good laugh at the fact that I forgot that Windows.edb can have compressed strings.

scudette commented 3 years ago

Is there a way we can automatically figure out that a value is compressed and decompress it?
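
One possible starting point, sketched in Go: peek at the first byte of the value and read a scheme identifier from its top five bits. The identifier values below (1 = 7-bit ASCII, 2 = 7-bit UTF-16, 3 = XPRESS) are assumptions based on the published ESE source and libesedb's format notes, and `classify` and the `scheme` type are hypothetical names, not part of go-ese. The header byte alone is probably not a reliable detector, since an uncompressed value can start with any byte; a real implementation would most likely also need to check the per-value "compressed" flag in the tagged data.

```go
package main

import "fmt"

// scheme is a hypothetical type naming the compression schemes.
type scheme uint8

// Assumed identifiers, stored in the top five bits of the first byte.
// Verify these against the ESE source before relying on them.
const (
	schemeNone      scheme = 0
	scheme7BitASCII scheme = 1
	scheme7BitUTF16 scheme = 2
	schemeXpress    scheme = 3
)

func (s scheme) String() string {
	switch s {
	case scheme7BitASCII:
		return "7-bit ASCII"
	case scheme7BitUTF16:
		return "7-bit UTF-16"
	case schemeXpress:
		return "XPRESS"
	}
	return "none/unknown"
}

// classify guesses the compression scheme from the first byte of buf.
// It returns schemeNone for empty input or an unrecognised identifier.
func classify(buf []byte) scheme {
	if len(buf) == 0 {
		return schemeNone
	}
	s := scheme(buf[0] >> 3)
	switch s {
	case scheme7BitASCII, scheme7BitUTF16, schemeXpress:
		return s
	}
	return schemeNone
}

func main() {
	// First bytes of the buffer quoted earlier in this thread.
	sample := []byte{0x10, 0xfb, 0x69, 0x2b}
	fmt.Println(classify(sample))
}
```

Under these assumptions, the 0x10 header byte of the buffer quoted above gives identifier 2, which would mean 7-bit packed UTF-16 rather than XPRESS.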

vicodark commented 3 years ago

esedbexport in libesedb does it. I didn't know until today that esedbexport does some artifact-specific processing in the tool itself for SRUM, Windows.edb, and others. For Windows.edb, it appears many of the strings are compressed with one of a few compression algorithms and also obfuscated with some simple bit manipulation, all of which esedbexport knows how to decode.

scudette commented 3 years ago

OK, let's take a look at what esedbexport does and match the artifact-specific processing if possible.

scudette commented 8 months ago

If you can share a sample file (even privately), we can implement support for compressed values. We have the ESE source code released by Microsoft, so the format is much easier to figure out.

Velocidex / go-ese

Implement support for compressed long values #8