I've made two changes to improve performance of ReClass when analyzing processes with lots of memory regions.
Summary -
Changed memory map to std::map from vector
Changed module list to std::map from vector
Changed lookups for both finding pages/addresses in the memory map and finding modules by address to use the std::map lookup instead of iterating through the whole vector. These changes took my draw time from ~3200ms per frame to ~28ms per frame.
Instead of walking the entire memory space of the target process on a 5 second timer to generate the memory map, it is now done incrementally, with ~10ms of work being done before yielding and picking up where it left off the next time the UpdateMemoryMap timer fires (every 30ms). This got rid of the stutter that would happen every 5 seconds when the memory map was updated.
Details:
**** Memory Map Address Lookup Changes ***
ReClass maintains a model of the memory map of the target process. It uses VirtualQueryEx to get information on memory regions and stores that information in a vector.
When drawing a cell that has no type applied to it, ReClass interprets the value at the cell as an address and, if the address is mapped in the target process, prints the address at the end of the cell highlighted in red. It determines whether an address is mapped by consulting that memory map model. The relevant code is:
    BOOLEAN IsMemory( ULONG_PTR Address )
    {
        for (size_t i = 0; i < g_MemMap.size( ); i++)
        {
            if ((Address >= g_MemMap[i].Start) && (Address <= g_MemMap[i].End))
                return true;
        }
        return false;
    }
It iterates through each region in the memory map model and checks if the address falls within that region. In my case, I had ~280,000 memory regions in the model so each call to IsMemory was brutal. If I made a class with more than a handful of base cells in it, I was getting 3 second render times which basically made ReClass unusable for me.
My solution to this was to change g_MemMap from a vector to a std::map keyed on the last byte of each region. The candidate region is then found with a single call to g_MemMap->lower_bound(), which returns the first region whose last byte is at or above Address. Comparing Address against that region's base address confirms whether Address actually falls within the region.
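To illustrate the lookup described above, here is a minimal sketch. It is not the code from this change: the `MemRegion` struct, the `MemMap` alias and the `FindRegion` helper are hypothetical names, and the map is passed as a parameter rather than accessed as the global g_MemMap so the example stands alone.

```cpp
#include <cstdint>
#include <map>

// Hypothetical stand-in for the region record stored in the memory map.
struct MemRegion {
    std::uintptr_t Start; // first byte of the region
    std::uintptr_t End;   // last byte of the region (inclusive)
};

// Keyed on the last byte (End) of each region, as described above.
using MemMap = std::map<std::uintptr_t, MemRegion>;

// Returns the region containing address, or nullptr if it is unmapped.
const MemRegion* FindRegion(const MemMap& memMap, std::uintptr_t address) {
    // lower_bound finds the first region whose End is >= address.
    // Every earlier region ends below address, so this is the only candidate.
    auto it = memMap.lower_bound(address);
    if (it == memMap.end()) {
        return nullptr; // address lies above every known region
    }
    // The candidate only matches if address is not below its base.
    return address >= it->second.Start ? &it->second : nullptr;
}

bool IsMemory(const MemMap& memMap, std::uintptr_t address) {
    return FindRegion(memMap, address) != nullptr;
}
```

Keying on the last byte rather than the base address means lower_bound lands directly on the candidate region, with no need to step the iterator backwards. This relies on the regions not overlapping.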
std::map lookups are O(log n) while the vector crawl is O(n). This is almost certainly not a big deal for more reasonably sized memory maps, but it made a huge difference for my 280k example. As stated above, my frame draw time went from ~3200ms to ~28ms with this change alone.
**** Memory Map Generation Changes ***
The next hiccup was that every 5 seconds, when ReClass re-generated the target process memory map, it would hang for upwards of a second. This makes sense: 280k calls to VirtualQueryEx are expensive.
My solution to this was to generate the memory map a little bit at a time and spread the work out across a couple of seconds for a smoother frame rate. I added UpdateMemoryMapIncremental, which works on crawling the target memory map for ~10ms at a time. Once the whole memory map has been captured, it's dumped into g_MemMap and the process starts over.
On the initial frame (no memory map info yet), it runs through the entire memory map nonstop so that the user doesn't have to wait for pointer highlighting. This eliminated the hang that occurred every 5 seconds, and I haven't noticed any memory map lag or side effects as a result.
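The time-sliced approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual UpdateMemoryMapIncremental: the `IncrementalMapBuilder` class, the `Region` struct and the `QueryFn` callback are hypothetical, and the callback stands in for whatever wraps VirtualQueryEx so the example can run without a target process.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <utility>

struct Region {
    std::uintptr_t Start; // first byte of the region
    std::uintptr_t End;   // last byte of the region (inclusive)
};
using RegionMap = std::map<std::uintptr_t, Region>; // keyed on End

// Hypothetical query: returns the next region at or after the given address,
// or nullopt once the address space is exhausted.
using QueryFn = std::function<std::optional<Region>(std::uintptr_t)>;

class IncrementalMapBuilder {
public:
    explicit IncrementalMapBuilder(QueryFn query) : query_(std::move(query)) {}

    // Works for roughly `budget`, then yields. Returns true when a full pass
    // has finished and `published` was replaced with the fresh map.
    bool Step(RegionMap& published, std::chrono::milliseconds budget) {
        const auto deadline = std::chrono::steady_clock::now() + budget;
        do {
            std::optional<Region> region = query_(cursor_);
            if (!region) {
                Publish(published);
                return true;
            }
            pending_[region->End] = *region;
            if (region->End == UINTPTR_MAX) { // avoid wrapping the cursor
                Publish(published);
                return true;
            }
            cursor_ = region->End + 1; // resume here on the next call
        } while (std::chrono::steady_clock::now() < deadline);
        return false; // out of budget, pick up from cursor_ next time
    }

private:
    void Publish(RegionMap& published) {
        published = std::move(pending_);
        pending_.clear(); // leave the moved-from map in a known empty state
        cursor_ = 0;      // the next pass starts from the bottom again
    }

    QueryFn query_;
    RegionMap pending_;
    std::uintptr_t cursor_ = 0;
};
```

Under these assumptions, the timer handler would call `Step` with a budget of about 10ms on each tick, and the initial frame would keep calling `Step` until it returns true. Lookups always read the last published map, so they never see a partially built one.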
I made the same change to the module list as to the memory map (changed from vector to std::map) but did nothing beyond that for modules, since it did not seem to make a noticeable difference.
**** Smaller Changes ***
A few places walked the memory map themselves; I changed those to use IsMemory() instead. Likewise, a few places walked the Modules list on their own; I changed those to use GetModule.