Open Andersama opened 2 years ago
Hi! Thanks for your report.
Could you be a bit clearer on what is the problem before discussing possible solutions? What are you doing? What is the version of Rizin you are using? Which OS? Which binary (if possible to share)?
From the additional context
section I'm trying doing (Fedora 34, x86_64, rizin v0.3.0)
$ rizin -d /bin/ls
> /x 00
Searching 1 byte in [0x55f9c8333000-0x55f9c8347000]
hits: 15413
Do you want to print 15413 lines? (y/N)
[...]
And this happens very quicky. Did I understand it right? Are you doing anything different? Thanks again
I built off of whatever's included by default for building cutter on it's default branch. I'm on windows x64 bit. From memory I ran the command to search for 0's by accident across all memory maps when I was attached to inscryption.exe, there's a demo on steam which I can only imagine would do the same thing. I don't have complete context for what rizin would've done in command line, I essentially froze cutter.
We would need few more details to start looking at this. I feel this is more of a "bug report" rather than a "feature request"
We need to understand, at least:
Without the above info there is not much we can do to help you, sorry.
Sure ok, I guess you're asking to be more specific: Windows 10 Home: Version 10.0.19042 Build 19042 (x64)
Here's what I did, or I believe there might be an extra option when debugging a program for "All memory maps", if there's a command which goes byte by byte for 0s you might want to try that instead. I wouldn't necessarily call it a bug, it was more or less just that the results took so long the application went unresponsive, and from memory although it's been a few weeks now it might be the work done to output to console for cutter that takes so long.
The binary I was analyzing was the inscryption game available on steam. There's been an update since, so I don't think I could get you an exact binary to look at, but I'm fairly confident the demo would do a similar thing if you really need it. I'd try making an executable which just loads a lot of 0's into memory and overall check and/or stress test the performance of the search widget when there's a lot of matches.
Is your feature request related to a problem? Please describe. May be more related to cutter's gui because it looks like the api has a maximum search limit / count and perhaps that'd solve the issue in commandline. However, without knowing that in advance you can hang the application by running a search which may have too many results. The issue stems from usage of a list structure which allocates for every saved result in
rz_search_t
.The problem function in question would be
rz_search_hit_new
.Edit: on second look it appears the api is being used with a callback which terminates early and skips the allocation I thought was the original issue. This might impact uses of this part of the api without a provided callback, they're likely slower than it should be. Whatever's causing the application hang eludes me.
Describe the solution you'd like Key part here would be to refactor the search functionality to use a vector like (or probably better rope like) structure. This should improve search timings considerably since time won't be wasted waiting on the system for memory. There appears to be a header for vectors in the library, it probably could be put to use here. A well designed rope structure would probably be better.
EG: instead of
something like:
Since results are addresses there's no need to allocate for every 8 byte address, and since any information related to pointers is auxiliary to the pointer there's and pointers are by definition unique there's plenty of opportunity to use hashtables to grab any interesting auxiliary information about the pointer later. It seems like in addition to the matched address currently a pointer to the search used is saved. There's probably a good amount of memory to save by not storing that pointer for each hit.
Describe alternatives you've considered Another idea would be a custom allocator, seems like that would considerably more difficult, and I suspect a better data structure is the better approach.
Additional context A fairly easy way to test this is to do what I did by accident, which is to debug a program and search for 0 effectively everywhere.
It also looks as if the maxhits variable can be used to reserve space in
rz_search_begin
. This in theory should reduce allocations wherever this api is used.