mandiant / GoReSym

Go symbol recovery tool
MIT License

Fix: Redundant read ELF sections cause memory exhaustion #47

Closed: khanhtaskymaviscom closed this 6 months ago

khanhtaskymaviscom commented 7 months ago

The function pcln_scan() does not stop reading section data once it reaches the section it needs, which can lead to excessive memory consumption. In a test with an ELF binary of approximately 100 MB, the process was terminated by the OOM killer. This PR aims to address the issue.

google-cla[bot] commented 7 months ago

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

stevemk14ebr commented 6 months ago

DataAfterSection works this way intentionally. There are obfuscators and packers which break what is usually a single segment into multiple smaller ones. For successful type recovery we are required to re-assemble these segments. We can't know which segments are split, so we have to start at one section and merge every section after it. Yes, this does cause out-of-memory errors in pathological cases, which is sad, but it is what it is.
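To make the merge behavior described above concrete, here is a minimal sketch in Go. It is not GoReSym's code: the `section` type and the `mergeFrom` function are hypothetical stand-ins, and real section data would come from a parsed binary rather than literals. It only illustrates why the merged buffer grows with everything that follows the starting section.

```go
package main

import "fmt"

// section is a hypothetical stand-in for a parsed ELF section.
type section struct {
	name string
	data []byte
}

// mergeFrom returns the bytes of the named section followed by the bytes of
// every later section, in file order. The second result is false when no
// section has that name. (Hypothetical helper, not a GoReSym API.)
func mergeFrom(sections []section, name string) ([]byte, bool) {
	for i, s := range sections {
		if s.name != name {
			continue
		}
		// The merged buffer is as large as the starting section plus
		// everything after it, which is where the memory cost comes from.
		total := 0
		for _, t := range sections[i:] {
			total += len(t.data)
		}
		merged := make([]byte, 0, total)
		for _, t := range sections[i:] {
			merged = append(merged, t.data...)
		}
		return merged, true
	}
	return nil, false
}

func main() {
	sections := []section{
		{name: ".text", data: []byte("AA")},
		{name: ".gopclntab", data: []byte("BB")},
		{name: ".frag1", data: []byte("CC")}, // a fragment split off by a packer
	}
	merged, ok := mergeFrom(sections, ".gopclntab")
	fmt.Println(ok, string(merged)) // prints "true BBCC"
}
```

Stopping at the end of `.gopclntab` would save memory but would miss `.frag1`, which is the trade-off the comment describes.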

I'd be more interested in a solution that keeps the section scanning logic the same, but optimizes the caching behavior. The real memory pressure occurs because we cache these large chunks of re-assembled segments in memory. If you want to look into this, please open a new PR.
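One possible direction, sketched here under stated assumptions, is to keep the "start at one section and read everything after it" logic but stream the data instead of holding a merged copy. This is not a proposal from the thread and not GoReSym's implementation: `span`, `memFile`, `readerFrom`, and `scanChunks` are hypothetical names, and only `io.NewSectionReader` and `io.MultiReader` are standard library calls.

```go
package main

import (
	"fmt"
	"io"
)

// span is a hypothetical description of where one section lives in the file.
type span struct {
	off, size int64
}

// memFile is a hypothetical in-memory file used only for the demo.
type memFile string

func (m memFile) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 || off >= int64(len(m)) {
		return 0, io.EOF
	}
	n := copy(p, string(m)[off:])
	if n < len(p) {
		return n, io.EOF
	}
	return n, nil
}

// readerFrom returns a reader over spans[start:] in order. It does not copy
// section data up front; bytes are fetched from file only when read.
func readerFrom(file io.ReaderAt, spans []span, start int) io.Reader {
	if start < 0 || start > len(spans) {
		return io.MultiReader() // empty reader
	}
	readers := make([]io.Reader, 0, len(spans)-start)
	for _, s := range spans[start:] {
		readers = append(readers, io.NewSectionReader(file, s.off, s.size))
	}
	return io.MultiReader(readers...)
}

// scanChunks reads r through one fixed-size buffer and passes each chunk to
// visit. Chunks may be shorter than chunkSize, and visit must not retain them.
func scanChunks(r io.Reader, chunkSize int, visit func(chunk []byte)) error {
	buf := make([]byte, chunkSize)
	for {
		n, err := r.Read(buf)
		if n > 0 {
			visit(buf[:n])
		}
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	file := memFile("textPCLNfrag")
	spans := []span{{off: 0, size: 4}, {off: 4, size: 4}, {off: 8, size: 4}}

	var seen []byte // collected only to show the result in this demo
	err := scanChunks(readerFrom(file, spans, 1), 3, func(c []byte) {
		seen = append(seen, c...)
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(seen)) // prints "PCLNfrag"
}
```

With this shape, the scanner's own memory use is bounded by the chunk size rather than by the total size of the trailing sections. A real pattern search would also need to handle matches that straddle chunk boundaries, which this sketch leaves out.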