malwarefrank / dnfile

Parse .NET executable files.
MIT License

Process strings, user_strings, GUIDs, etc. at time of load #51

Open R3MRUM opened 2 years ago

R3MRUM commented 2 years ago

Submitting a request to have things like strings, user_strings, and GUIDs processed when dnfile first loads an executable. Basically, this would mean implementing the code from the following example in dnfile itself:

https://github.com/malwarefrank/dnfile/blob/b2a24c5eb46995a739c7bb5f626d6f4052ccb753/examples/dnstrings.py

It would be great if the extracted strings could then be simply referenced by the user via a property like dnfile.net.user_strings, which would return a set of extracted user strings.

malwarefrank commented 2 years ago

Maybe. It is already easy to access a stream by name with the dnfile.net.streams dict. The mdtables shortcut exists, but that is a more complicated stream than the others, so I felt like it deserved its own shortcut.

I could implement an iterator for each of the base.ClrHeap streams (Strings, User Strings, Blob, and GUID). However, it is not a straightforward process. GUIDs are referenced by number, and each is the same size, so that would be easy. All the others are referenced by byte offset into the stream. This means that all of those streams can contain junk bytes in between the true strings, and we have no way to tell that by just iterating over the stream; it would require parsing all of the code looking for references, which is beyond the scope of this project at this time :)
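To make the index-versus-offset distinction concrete, here is a minimal sketch that works on raw heap bytes rather than on dnfile objects. The helper names `guid_at` and `string_at` are hypothetical and are not part of dnfile; the layout (16-byte GUID entries with 1-based indexes, NUL-terminated UTF-8 entries in the Strings heap) follows ECMA-335.

```python
import uuid

GUID_SIZE = 16  # every entry in the GUID heap is exactly 16 bytes


def guid_at(heap: bytes, index: int) -> uuid.UUID:
    """Return the GUID for a 1-based index into a raw GUID heap (hypothetical helper)."""
    if index < 1:
        raise IndexError("GUID indexes are 1-based; 0 means 'no GUID'")
    start = (index - 1) * GUID_SIZE
    chunk = heap[start:start + GUID_SIZE]
    if len(chunk) != GUID_SIZE:
        raise IndexError(f"GUID index {index} is out of range")
    # .NET stores GUIDs in the mixed-endian layout that bytes_le expects.
    return uuid.UUID(bytes_le=chunk)


def string_at(heap: bytes, offset: int) -> str:
    """Return the NUL-terminated UTF-8 string at a byte offset in a raw Strings heap."""
    end = heap.index(b"\x00", offset)  # raises ValueError if no terminator follows
    return heap[offset:end].decode("utf-8")


# Any offset "works", even one that lands in the middle of a real entry,
# which is why plain iteration cannot tell true strings from junk.
strings_heap = b"\x00Foo\x00Bar\x00"
print(string_at(strings_heap, 1))  # a referenced entry
print(string_at(strings_heap, 2))  # a suffix of the same entry, equally "valid"
```

Because GUID entries are fixed-size, iterating them is unambiguous; for the offset-based heaps, only references from the metadata tables or the code identify which offsets are real entries.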

Maybe the quick-n-easy solution is best for now: make the heap-type streams iterable, let any exceptions occur, and require the program doing the iterating to catch them.
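As a rough illustration of that quick-n-easy approach, the sketch below walks a raw User Strings heap entry by entry and simply lets errors propagate to the caller. It operates on plain bytes and is not dnfile's implementation; `read_compressed_uint` and `iter_user_strings` are hypothetical names. The entry format (a compressed length prefix, UTF-16LE data, then one trailing flag byte counted in the length) follows ECMA-335.

```python
from typing import Iterator, Tuple


def read_compressed_uint(data: bytes, pos: int) -> Tuple[int, int]:
    """Decode an ECMA-335 compressed unsigned integer; return (value, bytes_used)."""
    first = data[pos]  # raises IndexError if pos is past the end
    if first & 0x80 == 0:
        return first, 1
    if first & 0xC0 == 0x80:
        if pos + 2 > len(data):
            raise ValueError("truncated 2-byte length prefix")
        return ((first & 0x3F) << 8) | data[pos + 1], 2
    if first & 0xE0 == 0xC0:
        if pos + 4 > len(data):
            raise ValueError("truncated 4-byte length prefix")
        value = ((first & 0x1F) << 24) | (data[pos + 1] << 16) \
            | (data[pos + 2] << 8) | data[pos + 3]
        return value, 4
    raise ValueError(f"invalid length prefix 0x{first:02x} at offset {pos}")


def iter_user_strings(heap: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (offset, text) for each non-empty entry, walking the heap sequentially.

    Malformed data raises ValueError (including UnicodeDecodeError); the caller
    is expected to catch it.
    """
    pos = 0
    while pos < len(heap):
        length, used = read_compressed_uint(heap, pos)
        start, end = pos + used, pos + used + length
        if end > len(heap):
            raise ValueError(f"entry at offset {pos} runs past the end of the heap")
        if length:
            # The final byte of each entry is a flag, not string data.
            yield pos, heap[start:end - 1].decode("utf-16-le")
        pos = end
```

A caller would wrap the loop in `try`/`except ValueError` and decide whether to stop or resume scanning at a later offset. Note that a sequential walk assumes entries are packed back to back, which is exactly the assumption that junk bytes between entries can break.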

I will think about this some more, but welcome others' thoughts.