Memory Search

Zhentar commented 6 years ago

I've forked DbgShell and started putting together a basic memory search command (which hopefully I will be able to polish into a reasonable pull request before the ADHD decides otherwise for me). I wanted to share some thoughts and get some input.

I've forgone DbgEng's search granularity option so that I could implement search alignment independent of the search size, to allow things like searching for pointers to nearby addresses:
```
> Search-DbgMemory 04244c -SearchValueLengthInBytes 3 -SearchResultAlignment 4 -FromAddress 1 | % { $_.Address - 1 } | Read-DbgMemory -LengthInBytes 16
719b1c40  04244c8b 060441f7 b8000000 00000001
719d8df8  04244c13 042444dd c310c483 fe4356e9
71a1e99c  04244c8b 04c231d9 244c8b00 c221d904
```
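As a rough illustration of what that command line does, here is a minimal Python sketch of a byte-pattern search whose alignment is independent of the pattern length. The function name `find_aligned` and its parameters are made up for this example and are not DbgShell's API. It also assumes that alignment is measured relative to the starting address, which is what `-FromAddress 1` appears to rely on. The sample bytes are the third row of the output above.

```python
import struct

def find_aligned(memory, pattern, alignment=1, from_address=0, base_address=0):
    """Yield addresses of `pattern` where (address - from_address) % alignment == 0.

    `memory` is a bytes snapshot whose first byte lives at `base_address`.
    """
    # Never report hits that start before the requested origin.
    pos = memory.find(pattern, max(from_address - base_address, 0))
    while pos != -1:
        address = base_address + pos
        if (address - from_address) % alignment == 0:
            yield address
        pos = memory.find(pattern, pos + 1)  # +1 so overlapping hits are not skipped

# Third row of the output above, stored as little-endian DWORDs.
BASE = 0x71A1E99C
ROW = struct.pack("<4I", 0x04244C8B, 0x04C231D9, 0x244C8B00, 0xC221D904)
PATTERN = (0x04244C).to_bytes(3, "little")  # bytes 4c 24 04

aligned = list(find_aligned(ROW, PATTERN, alignment=4, from_address=1, base_address=BASE))
unaligned = list(find_aligned(ROW, PATTERN, base_address=BASE))
print([hex(a - 1) for a in aligned])  # start of each DWORD whose upper three bytes match
print([hex(a) for a in unaligned])    # every byte offset where the pattern occurs
```

The row happens to contain the pattern twice. Only the occurrence one byte past a DWORD boundary survives the alignment filter, which is the effect the `-SearchResultAlignment 4 -FromAddress 1` combination is after.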
1. On the one hand, awesome! You're not going to be doing that in WinDbg... On the other hand, that's not a very straightforward approach, and byte granularity means you're searching for pointers in either a 256-byte region or a 64 kB region, with nothing in between... Any thoughts on a better way to do it?
2. I'm taking the search value as a ulong, which means it caps out at 8 bytes... Supporting strings seems easy enough, but I've no idea how to tackle 9+ byte non-string patterns. Are there any existing commands I can crib from?

3. My ultimate goal is to do something like this:

[Heap 007f0000 segment 52800000 (msvcrt!_crtheap)]
52808dec  04244c8b e808508d fff61480 cc0004c2
[<unknown>]
658d0160  04244c8d 04244489 8b4ceca1 24448965
[srvcli; "C:\Windows\System32\srvcli.dll"]
719b1c40  04244c8b 060441f7 b8000000 00000001
719d8df8  04244c13 042444dd c310c483 fe4356e9
71a1e99c  04244c8b 04c231d9 244c8b00 c221d904

Is there a way I can do grouping without accumulating?

jazzdelightsme commented 6 years ago

Cool!

For 1) and 2), I am not very familiar with dbgeng's memory searching... I'll have to read up on that. By far the best memory searching functionality I've seen was in a debugger extension called !pde, but I'm not sure if that's available externally. I'll look into it.

For 3) Can grouping be done without accumulating? Yes, the grouping done by the alternate formatting engine in DbgShell operates on a streaming basis--it evaluates each item against the grouping criteria, and when it comes back different, it's a new group.
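The streaming idea can be sketched in a few lines of Python. This is only an illustration of the technique, not DbgShell's formatting engine: `format_grouped`, `fake_search`, and the dictionary layout of a hit are all hypothetical. Each item's group key is compared with the previous item's key, and a header is emitted whenever it changes, so nothing is buffered.

```python
def format_grouped(hits, group_key):
    """Yield display lines, emitting a header each time the group key changes."""
    first = True
    previous = None
    for hit in hits:
        current = group_key(hit)
        if first or current != previous:
            yield f"[{current}]"
            previous, first = current, False
        yield f"{hit['address']:08x}  {hit['value']:08x}"

def fake_search():
    """Hypothetical hit stream, with values borrowed from the example earlier in the thread."""
    yield {"address": 0x52808DEC, "value": 0x04244C8B, "region": "Heap 007f0000"}
    yield {"address": 0x658D0160, "value": 0x04244C8D, "region": "<unknown>"}
    yield {"address": 0x719B1C40, "value": 0x04244C8B, "region": "srvcli"}
    yield {"address": 0x719D8DF8, "value": 0x04244C13, "region": "srvcli"}

for line in format_grouped(fake_search(), lambda hit: hit["region"]):
    print(line)
```

Because the generator only remembers the previous key, it works on an unbounded stream. The trade-off is that items from the same group must arrive contiguously, which an address-ordered memory search provides naturally.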

jazzdelightsme commented 6 years ago

Here's where you can get PDE: https://channel9.msdn.com/Shows/Defrag-Tools -> follow link to OneDrive for downloads. Here is the video about string searching.

The source is not available externally unfortunately.

Zhentar commented 6 years ago

Just had need of it, and I am indeed rather impressed with the string searching. Don't suppose you could get me the public debug symbols for it? They're nearly as good as source 😉

Zhentar commented 6 years ago

Getting closer to what I want :)

> Search-DbgMemory 04244c8b | Read-DbgMemory -LengthInBytes 4
VirtualAlloc 75b10000 - 75d6c000  MEM_IMAGE  combase
75b6ade4  04244c8b

VirtualAlloc 75f50000 - 760dd000  MEM_IMAGE  user32
75f8df60  04244c8b
75f8e180  04244c8b

VirtualAlloc 76170000 - 7628e000  MEM_IMAGE  ucrtbase
761b68e0  04244c8b
761b6ed0  04244c8b
761b6f40  04244c8b

(Hmmm, only just now occurred to me that "VirtualAlloc" is not a very good label for MEM_IMAGE/MEM_MAPPED regions...)

(BTW GroupByResultIsDifferent doesn't really work as intended; == on two object references will only test reference equality. https://github.com/Zhentar/DbgShell/commit/f212ff00385005e8dcf909fdd16f950f8d275737 has a rewrite of it, including sequence comparison)

Zhentar commented 6 years ago

Now this is looking a lot like what my heart desires 😁

> Search-DbgMemory 010203a0 | Read-DbgMemory -LengthInBytes 4
VirtualAlloc 007f0000 - 008f0000  MEM_PRIVATE  Heap 007f0000
Heap entry body 008242f0 size 0x2000 Busy
00825d60  010203a0
008ce458  010203a0

VirtualAlloc 06c70000 - 06d70000  MEM_PRIVATE  Heap 007f0000
Heap entry body 06cc9340 size 0x690 Busy
06cc9348  010203a0
06cca1e8  010203a0

Zhentar commented 6 years ago

I was wondering how !PDE.spx manages to be so much faster than SearchVirtual2, so I took a look at how they work.

PDE.spx fetches one page at a time using ReadVirtual (filtering by virtual region attributes), casts to a pointer-size array, and indexes through that looking for matches.
SearchVirtual2 fetches one page at a time using ReadVirtual, searching for matching byte patterns and rejecting matches with inappropriate alignment.
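
To make the contrast concrete, here is a hedged Python sketch of the two strategies as described above. It is not the actual PDE or dbgeng code. The `read_virtual` callback is a hypothetical stand-in for a ReadVirtual call, and `make_fake_target` builds fake memory so the example runs on its own. It assumes a 4 kB page size, little-endian data, and a page-aligned start address.

```python
import struct

PAGE_SIZE = 0x1000

def scan_aligned(read_virtual, start, end, target, width=4):
    """Aligned scan: treat each page as an array of `width`-byte little-endian integers."""
    fmt = "<" + {4: "I", 8: "Q"}[width]
    for page in range(start, end, PAGE_SIZE):
        data = read_virtual(page, min(PAGE_SIZE, end - page))
        if data is None:  # unreadable or filtered-out page
            continue
        usable = len(data) - len(data) % width
        for index, (value,) in enumerate(struct.iter_unpack(fmt, data[:usable])):
            if value == target:
                yield page + index * width

def scan_bytes(read_virtual, start, end, pattern, alignment=1):
    """Byte-pattern scan: find every occurrence, then reject misaligned hits.

    Simplification: a pattern straddling two pages is missed here.
    """
    for page in range(start, end, PAGE_SIZE):
        data = read_virtual(page, min(PAGE_SIZE, end - page))
        if data is None:
            continue
        pos = data.find(pattern)
        while pos != -1:
            if (page + pos) % alignment == 0:
                yield page + pos
            pos = data.find(pattern, pos + 1)

def make_fake_target(dwords):
    """Build a hypothetical read_virtual over sparse, zero-filled pages."""
    pages = {}
    for address, value in dwords.items():
        page = address & ~(PAGE_SIZE - 1)
        buf = pages.setdefault(page, bytearray(PAGE_SIZE))
        struct.pack_into("<I", buf, address - page, value)

    def read_virtual(address, size):
        buf = pages.get(address)
        return bytes(buf[:size]) if buf is not None else None

    return read_virtual

read_virtual = make_fake_target({
    0x00825D60: 0x010203A0,  # aligned hit (address taken from the output above)
    0x00826102: 0x010203A0,  # deliberately misaligned copy
    0x008CE458: 0x010203A0,  # aligned hit
})
START, END = 0x00825000, 0x008CF000
print([hex(a) for a in scan_aligned(read_virtual, START, END, 0x010203A0)])
print([hex(a) for a in scan_bytes(read_virtual, START, END, struct.pack("<I", 0x010203A0), 4)])
```

In this sketch the aligned scan never sees the misaligned copy at all, while the byte scan finds it and then throws it away. An aligned power-of-2 value also cannot straddle a page boundary, so the aligned scan needs no special handling there.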

Hmmm, so my desire to "play nice" and use SearchVirtual2 led me to wrap it in a second layer of exactly the same weaknesses. Meanwhile, PDE.spx takes exactly the same approach my fuzzy-searching prototype used, and that prototype was both easier & more intuitive to use and more capable than my first Search-DbgMemory attempt.

Which leads me to conclude that PDE achieves its much better search experience because it rightly separates searching into two distinct tasks: aligned, power-of-2 byte-sized searches, and arbitrary-size byte/character array searches. And also that using ReadVirtual to read page-sized blocks and search them, rather than using SearchVirtual2, is a totally reasonable and well-performing approach.

So, my start on round 2:

> Search-DbgMemory 010203a0 -SearchMask 0xFFFFFF03
VirtualAlloc 007f0000 - 008f0000  MEM_PRIVATE  Heap 007f0000
Heap entry body 008242f0 size 0x2000 Busy
00825d60  010203a0                             ....

VirtualAlloc 52010000 - 5274e000  MEM_IMAGE  System_Xml_ni
5235d714  0102036b                             k...
5269344c  01020390                             ....

microsoft / DbgShell

Memory Search #45