Open alk opened 5 days ago
As you say, this seems complicated. It seems that we could have something like
If PC is 9, we will find entry 3, but we want to find entry 1. I'm not sure how best to handle that case.
Yes, that is the complication I was referring to above where my approach wouldn't work. Still, I think it is strictly more general that existing code.
As for proper fix. If we can assume that only nesting is possible without overlaps, then I'd "simply" introduce extra field to entry that points to it's direct "parent" entry. Parent entry being entry that nests "on top". We'd then add simple linear pass after sorting to establish those links.
Here is my attempt to address this: https://github.com/alk/gperftools/commit/e1d65de6805b0aac689c19d7ec1e8de16c88b048 using logic I described above.
Let me know what you think. It seems to work (I can indeed see cases like https://github.com/ianlancetaylor/libbacktrace/issues/137#issuecomment-2391991779 in my testing). And I managed to convince myself it is "quite obvious" the right thing to do. But we should surely something better. Anything I can to do somehow cover this logic by tests ?
Thanks for the patch. I'm a little concerned by the increased memory usage. A large program can have a lot of unit_addrs. It would be one thing if the new pointer were often non-NULL, but it seems to me that it will almost always be NULL.
Sure. We can then have separate array of ranges simply for "top level" entries. I.e. entries without parent and with at least one child. Then when dealing with normal entries we'll simply scan backwards until the start of matching top-level entry (if any). I am happy to try that too. What do you think?
Maybe? Not sure.
okay I'll give it a shot.
This https://github.com/alk/gperftools/commit/d2795cdb214917c1ed3223ade0d0a200883630e2
Of course it'll need at least some cleaning (I know my coding style doesn't always match this file's; there is debugging stuff left; variable names etc), but as a proof of concept it should do.
Indeed at least in my test program number of top entries is quite small. 116 in total (when built ./configure CXX='g++ -no-pie' CXXFLAGS='-O0 -fno-inline -ggdb3' CFLAGS='-O0 -fno-inline -ggdb3' --disable-shared)
Let me know what you think.
Hi. I faced occasional symbolization failures. They tend to occur when building with less optimizations (-O1 -fno-inline -ggdb3 in my case).
When stepping in debugger I found that we're reaching "found_entry = 0" case in dwarf_lookup_pc. Thankfully, git blame pointed me at #44, which indeed seems relevant.
My entries end up like this:
I.e. entry with highest suitable low bound, has too low high bound, but entry immediately prior to that has lower low but high enough high. I.e. previous entry is the entry we need, but it isn't at exactly same low bound as current.
I am not sure what are the possibilities w.r.t. entries nesting or overlapping. But this patch seems to be more general than your code (which requires previous entry to be exactly at same low as current in order to step to that previous entry).
Because I am not sure about nesting (or even overlapping) possibilities, a so I cannot be sure if my patch is truly correct. But in some sense it is "more correct" than your current approach. Otherwise might need some more elaborate fix (like establishing proper parent-child links).
Let me know what you think and if my patch is okay, how you want this to be sent (pull request, or just patch above is fine etc). Also if/how we can somehow cover it by unit tests.