Velocidex / go-ese

Go implementation of an Extensible Storage Engine parser
Apache License 2.0

Implement support for compressed long values #8

Open vicodark opened 3 years ago

vicodark commented 3 years ago

I've tried to run eseparser to dump the SystemIndex_PropertyStore table from a few Windows.edb files. Every time, most of the string data comes out as what looks like Chinese (or similar) characters, as seen here:

{"WorkID":40,"27F-System_Search_Rank":707406378,"14F-System_FileAttributes":707406378,"4440-System_ItemFolderPathDisplay":"尕绮諚檵夭ᖬ둮처錶炵岵圌엃굠ౙ鬖拍�ᖭ솃絢献淊잦淼뮦ﲱ缽ﲹ�碼왆얉㋷๷쬏懩࠷麝�㳲","

I tried digging around in the code, and it looks like the taggedItems buffers returned by ParseTaggedValues for these Long Text columns do not hold the string data at all. A random selection of the data stored there, hex encoded, looks like this: 10fb692bd6aab564b156ac96bbd16232db4cd6c2d572315c0d1783b56631586c368bd966b7560c068bf501

Any idea what's going on here?
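
The garbled output is consistent with the raw column bytes being read directly as UTF-16LE text. The following is a minimal sketch of that effect using the hex sample quoted above; `mustSample` and `decodeUTF16LE` are hypothetical helpers written for illustration and are not part of go-ese.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"unicode/utf16"
)

// sampleHex is the raw column value quoted in the report above.
const sampleHex = "10fb692bd6aab564b156ac96bbd16232db4cd6c2d572315c0d1783b56631586c368bd966b7560c068bf501"

// mustSample decodes sampleHex into bytes and panics on malformed input.
func mustSample() []byte {
	raw, err := hex.DecodeString(sampleHex)
	if err != nil {
		panic(err)
	}
	return raw
}

// decodeUTF16LE interprets raw as little-endian UTF-16 code units.
// A trailing odd byte, if any, is ignored.
func decodeUTF16LE(raw []byte) string {
	units := make([]uint16, 0, len(raw)/2)
	for i := 0; i+1 < len(raw); i += 2 {
		units = append(units, binary.LittleEndian.Uint16(raw[i:]))
	}
	return string(utf16.Decode(units))
}

func main() {
	raw := mustSample()
	fmt.Printf("%d bytes, first byte 0x%02x\n", len(raw), raw[0])
	// Printed with %q so unusual code points show up as escapes.
	fmt.Printf("%q\n", decodeUTF16LE(raw))
}
```

The buffer has an odd length, which already hints that it is not plain UTF-16 text, and almost every resulting code unit falls outside the Latin range.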

vicodark commented 3 years ago

Well you can just go ahead and ignore the above and have a good laugh at the fact that I forgot that Windows.edb can have compressed strings.

scudette commented 3 years ago

Is there a way we can automatically figure out it is compressed and decompress it?

vicodark commented 3 years ago

esedbexport in libesedb does it. I didn't know till today that esedbexport does some artifact-specific processing in the tool itself for SRUM, Windows.edb, and others. For Windows.edb, it appears many of the strings are compressed by one of a few compression algos and also obfuscated with some simple bitbashing stuff, all of which esedbexport knows how to decode.

scudette commented 3 years ago

OK, let's take a look at what esedbexport does and match the specific artifact processing if possible.

scudette commented 8 months ago

If you can share a sample file (even privately), we can implement support for compressed values. We have the source code for the ESE release by Microsoft, so it is much easier to figure out.
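
As a starting point, here is a hedged sketch of decoding the hex sample quoted earlier in this thread as a 7-bit packed value. It rests on assumptions that should be checked against Microsoft's published ESE source: that the top five bits of the first byte identify the compression scheme (0x1 and 0x2 for the 7-bit ASCII and 7-bit Unicode variants), and that the low three bits plus one give the number of bits used in the final byte. `decode7Bit` and `mustHex` are hypothetical names, not existing go-ese functions. This covers only engine-level 7-bit packing, not the Windows Search specific processing that esedbexport performs.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// sampleHex is the raw column value quoted earlier in this thread.
const sampleHex = "10fb692bd6aab564b156ac96bbd16232db4cd6c2d572315c0d1783b56631586c368bd966b7560c068bf501"

// Assumed scheme identifiers, stored in the top five bits of the header byte.
const (
	scheme7BitASCII   = 0x1
	scheme7BitUnicode = 0x2
)

// mustHex decodes a hex string and panics on malformed input.
func mustHex(s string) []byte {
	raw, err := hex.DecodeString(s)
	if err != nil {
		panic(err)
	}
	return raw
}

// decode7Bit unpacks a buffer whose header byte announces 7-bit packing.
// Characters are read least-significant bit first, seven bits at a time.
func decode7Bit(buf []byte) (string, error) {
	if len(buf) < 2 {
		return "", errors.New("buffer too short")
	}
	scheme := buf[0] >> 3
	if scheme != scheme7BitASCII && scheme != scheme7BitUnicode {
		return "", fmt.Errorf("unsupported scheme 0x%x", scheme)
	}
	data := buf[1:]
	// Assumption: low three header bits + 1 = bits used in the final byte.
	finalBits := int(buf[0]&0x7) + 1
	count := ((len(data)-1)*8 + finalBits) / 7

	out := make([]byte, 0, count)
	for i := 0; i < count; i++ {
		bit := i * 7
		idx, shift := bit/8, uint(bit%8)
		v := uint16(data[idx]) >> shift
		// With shift >= 2 the seven bits spill into the next byte.
		if shift > 1 && idx+1 < len(data) {
			v |= uint16(data[idx+1]) << (8 - shift)
		}
		out = append(out, byte(v&0x7f))
	}
	return string(out), nil
}

func main() {
	buf := mustHex(sampleHex)
	s, err := decode7Bit(buf)
	if err != nil {
		panic(err)
	}
	fmt.Printf("scheme=0x%x chars=%d prefix=%q\n", buf[0]>>3, len(s), s[:10])
}
```

Under these assumptions the sample unpacks to a brace-wrapped string beginning with `{S-1-5-21-`, which suggests at least some of these values use plain 7-bit packing. Other schemes are deliberately rejected here rather than guessed at.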