ajkhoury / ReClassEx

MIT License

Performance improvements wrt g_MemMap #57

Closed Douggem closed 3 years ago

Douggem commented 3 years ago

I've made two changes to improve performance of ReClass when analyzing processes with lots of memory regions.

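Summary - (1) g_MemMap is now a std::map instead of a vector, so address lookups are O(log n) instead of O(n); (2) the memory map is regenerated incrementally, ~10ms at a time, instead of all at once.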

Details:

**** Memory Map Address Lookup Changes ****

ReClass maintains a model of the memory map of the target process. It uses VirtualQueryEx to get information on memory regions and stores that information in a vector.

When drawing a cell that has no type applied to it, ReClass interprets the value at the cell as an address and, if that address is mapped in the target process, prints the address at the end of the cell highlighted in red. It determines whether an address is mapped by using that memory map model. The relevant code is:

BOOLEAN IsMemory( ULONG_PTR Address )
{
    for (size_t i = 0; i < g_MemMap.size( ); i++)
    {
        if ((Address >= g_MemMap[i].Start) && (Address <= g_MemMap[i].End))
            return true;
    }
    return false;
}

It iterates through each region in the memory map model and checks whether the address falls within that region. In my case, I had ~280,000 memory regions in the model, so each call to IsMemory was brutal. If I made a class with more than a handful of base cells in it, I was getting 3 second render times, which basically made ReClass unusable for me.

My solution to this was to change g_MemMap from a vector to a std::map keyed on the last byte of each region. The best-match region can then be looked up with a call to g_MemMap.lower_bound(), which returns the first region whose last byte is at or after Address. Comparing that region's base address against Address then tells us whether Address is actually within the region.

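BOOLEAN IsMemory( ULONG_PTR Address )
{
    // The map is keyed on each region's last byte, so lower_bound() returns
    // the first region that ends at or after Address.
    auto containingBlock = g_MemMap.lower_bound(Address);
    // Address is mapped only if that region also starts at or before it.
    if (containingBlock != g_MemMap.end()
        && containingBlock->second.Start <= Address) {
        return true;
    }
    return false;
}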

std::map lookups are O(log n) while the vector crawl is obviously O(n). This is almost certainly not a big deal for more reasonably sized memory maps, but it made a huge difference for my 280k example. As stated above, my frame draw time went from ~3200ms to ~28ms with this change alone.
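To make the boundary behaviour of the lower_bound() lookup concrete, here is a minimal standalone sketch. It is not the ReClass code: MemoryRegion, RegionMap, AddRegion, IsMapped and MakeExampleMap are hypothetical names, and the region layout is made up for illustration. It assumes, as in the description above, that End is the last byte of a region (inclusive).

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical stand-in for a ReClass region record (assumption: only these two fields).
struct MemoryRegion {
    std::uintptr_t Start; // first byte of the region
    std::uintptr_t End;   // last byte of the region (inclusive)
};

// Regions keyed on their last byte, as described in the post.
using RegionMap = std::map<std::uintptr_t, MemoryRegion>;

// Inserts a region of `size` bytes starting at `start`.
void AddRegion(RegionMap& map, std::uintptr_t start, std::uintptr_t size) {
    assert(size > 0);
    const std::uintptr_t end = start + size - 1;
    map.emplace(end, MemoryRegion{start, end});
}

// O(log n) containment check: find the first region ending at or after
// `address`, then confirm that it also starts at or before `address`.
bool IsMapped(const RegionMap& map, std::uintptr_t address) {
    auto it = map.lower_bound(address);
    return it != map.end() && it->second.Start <= address;
}

// Two made-up regions with an unmapped gap between them.
RegionMap MakeExampleMap() {
    RegionMap map;
    AddRegion(map, 0x1000, 0x1000); // covers 0x1000..0x1FFF
    AddRegion(map, 0x4000, 0x2000); // covers 0x4000..0x5FFF
    return map;
}
```

With this layout, IsMapped reports 0x1000 and 0x1FFF as mapped and 0x0FFF and 0x2000 as unmapped. The sketch relies on regions not overlapping, which is why a single lower_bound() probe is enough.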

**** Memory Map Generation Changes ****

The next hiccup was that every 5 seconds, when ReClass re-generated the target process memory map, it would hang for upwards of a second. This makes sense - 280k calls to VirtualQueryEx are expensive.

My solution to this was to generate the memory map a little bit at a time and spread the work out across a couple of seconds for a smoother frame rate. I added UpdateMemoryMapIncremental, which works on crawling the target memory map for ~10ms at a time. Once the whole memory map has been captured, it's dumped into g_MemMap and the process starts over.

On the initial frame (no memory map info yet), it runs through the entire memory map without pausing so that the user doesn't have to wait for pointer highlighting. This eliminated the hang that used to occur every 5 seconds, and I haven't noticed any memory map lag or side effects as a result.
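The time-sliced approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the UpdateMemoryMapIncremental code from the PR: IncrementalMapBuilder, RegionQuery and FakeQuery are hypothetical names, and FakeQuery is a made-up stand-in for VirtualQueryEx so that the sketch builds on any platform.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <limits>
#include <map>
#include <utility>

struct MemoryRegion { std::uintptr_t Start; std::uintptr_t End; }; // End is inclusive
using RegionMap = std::map<std::uintptr_t, MemoryRegion>;          // keyed on End

// Hypothetical query: fills `out` with the region containing or following
// `address`; returns false once there are no more regions.
using RegionQuery = std::function<bool(std::uintptr_t, MemoryRegion&)>;

class IncrementalMapBuilder {
public:
    IncrementalMapBuilder(RegionQuery query, std::chrono::milliseconds budget)
        : query_(std::move(query)), budget_(budget) {}

    // Performs roughly one budget's worth of work. Returns true when a full
    // pass has finished and the published map has been replaced.
    bool Step() {
        const auto deadline = std::chrono::steady_clock::now() + budget_;
        do {
            MemoryRegion region{};
            if (!query_(cursor_, region)) return Publish();
            pending_.emplace(region.End, region);
            // Stop before the cursor would wrap around to zero.
            if (region.End == std::numeric_limits<std::uintptr_t>::max()) return Publish();
            cursor_ = region.End + 1;
            // The very first pass ignores the budget so a map is available immediately.
        } while (!hasPublished_ || std::chrono::steady_clock::now() < deadline);
        return false;
    }

    const RegionMap& Published() const { return published_; }

private:
    bool Publish() {
        published_ = std::move(pending_);
        pending_.clear();   // leave the moved-from map in a known empty state
        cursor_ = 0;        // the next Step() starts a new pass
        hasPublished_ = true;
        return true;
    }

    RegionQuery query_;
    std::chrono::milliseconds budget_;
    std::uintptr_t cursor_ = 0;
    RegionMap pending_;     // map being built across calls
    RegionMap published_;   // last complete map, used for lookups
    bool hasPublished_ = false;
};

// Made-up target: 4 KiB regions at 0x1000, 0x3000, ..., 0xF000 (8 regions).
bool FakeQuery(std::uintptr_t address, MemoryRegion& out) {
    const std::uintptr_t kLimit = 0x10000;
    if (address >= kLimit) return false;
    std::uintptr_t page = address / 0x1000;
    if (page % 2 == 0) page += 1;           // even pages are unmapped
    const std::uintptr_t start = page * 0x1000;
    if (start >= kLimit) return false;
    out = MemoryRegion{start, start + 0xFFF};
    return true;
}
```

A caller would invoke Step() once per frame with a budget of around 10ms and keep answering lookups from Published(), which only changes when a pass completes. Keeping the in-progress map separate from the published one is what lets lookups stay consistent while the crawl is spread across frames.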

I made the same change to the module mapping as to the memory map (changed from vector to std::map) but didn't do anything beyond that because it didn't seem to make a noticeable difference.

**** Smaller Changes ****

A few places walked the memory map on their own; I changed those to use IsMemory() instead. Likewise, a few places walked the Modules list on their own; I changed those to use GetModule.

ajkhoury commented 3 years ago

These are some epic changes, and something I wanted to do myself years ago. Thank you!