NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

MDMangGhidra OOM #6586

Closed codeno closed 1 month ago

codeno commented 1 month ago

Reproduce with:

new MDMangGhidra().demangle("?A0x481a9c16.TypeMatch<struct__s_HandlerType,struct__s_CatchableType_const_,struct__s_ThrowInfo_const_>", false);

Symbol taken from mslispr.dll of Mathcad 15 (attachment: mslispr.zip).

ghizard commented 1 month ago

I've been coming across a number of symbols with dotted notation. These might take a while to get right, as a mix of several concepts may be involved. It will take some study over time, but that doesn't mean we cannot come up with preliminary solutions that evolve until we get this correct.

For the particular symbol you mention, the dot seems to separate a mangled anonymous namespace from a symbol that is not mangled.

Your file also contains this symbol, which starts with a mangled anonymous namespace and a dot but is followed by what I would declare a standard mangled symbol:

?A0x49040904.??__F?_lock@AtExitLock@<CrtImplementationDetails>@@$$Q0V?$Handle@P$AAVObject@System@@@2@A@@YMXXZ

Back to the original symbol... unfortunately, Ghidra cannot handle symbols with spaces in their names. This was a decision made before my time and something that I've poked at a number of times to try to remedy, but there are bigger underlying issues that come into play. So in the symbol that you have, a number of spaces have already been replaced with underscores.

I suspect that the original would have been something like:

?A0x481a9c16.TypeMatch<struct _s_HandlerType,struct _s_CatchableType const ,struct _s_ThrowInfo const >

In this situation, the loader is placing the symbol at the address after calling a method that standardizes the symbol (replacing the spaces with underscores).
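
To illustrate the standardization step, here is a minimal sketch that only replaces spaces with underscores. The class and method names (`SymbolNameSketch`, `standardize`) are hypothetical and are not Ghidra's actual API; the real loader logic may replace other characters as well. The input is the suspected original symbol from above.

```java
public final class SymbolNameSketch {

    // Hypothetical helper (not a Ghidra API): mimics the space-to-underscore
    // replacement described above, and nothing else.
    public static String standardize(String name) {
        return name.replace(' ', '_');
    }

    public static void main(String[] args) {
        // Suspected original symbol, as guessed earlier in this thread.
        String suspectedOriginal = "?A0x481a9c16.TypeMatch<struct _s_HandlerType,"
            + "struct _s_CatchableType const ,struct _s_ThrowInfo const >";
        System.out.println(standardize(suspectedOriginal));
    }
}
```

Applied to the suspected original, this yields the same string that was passed to `demangle` in the reproduction, which is consistent with the guess about where the underscores came from.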

The Demangler (whether the Demangler Analyzer or your script call) probably would not have too much issue with the spaces already being replaced, but I'd have to ensure that the standard demangler processing gets bypassed on the portion after the dot, as it might not like the angle brackets and such.

I would suspect that we could just output this (the anonymous namespace gets processed by the standard Demangler logic and the portion after the dot is not):

`anonymous namespace'.TypeMatch<struct__s_HandlerType,struct__s_CatchableType_const_,struct__s_ThrowInfo_const_>

or

`anonymous namespace'::TypeMatch<struct__s_HandlerType,struct__s_CatchableType_const_,struct__s_ThrowInfo_const_>
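
A rough sketch of the split described above, producing the second (`::`) form. Everything here is an assumption for illustration: the class `DottedSymbolSketch`, the method `render`, and the regular expression are hypothetical, the `?A0x<hex>.` prefix shape is inferred only from the two symbols in this thread, and this is not the fix that went into Ghidra. When the portion after the dot looks mangled (starts with `?`), the sketch gives up and returns `null`, since that portion would need the standard demangler.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class DottedSymbolSketch {

    // Assumed shape: "?A0x" + hex digits (anonymous namespace), a dot,
    // then the remainder of the symbol captured in group 2.
    private static final Pattern DOTTED =
        Pattern.compile("^(\\?A0x[0-9a-fA-F]+)\\.(.+)$");

    // Returns a display string for a dotted symbol whose remainder is not
    // mangled, or null if the symbol does not fit that case.
    public static String render(String symbol) {
        Matcher m = DOTTED.matcher(symbol);
        if (!m.matches()) {
            return null; // not a dotted anonymous-namespace symbol
        }
        String remainder = m.group(2);
        if (remainder.startsWith("?")) {
            return null; // remainder looks mangled; needs the real demangler
        }
        // Bypass demangling for the remainder and emit it verbatim.
        return "`anonymous namespace'::" + remainder;
    }

    public static void main(String[] args) {
        System.out.println(render("?A0x481a9c16.TypeMatch<struct__s_HandlerType,"
            + "struct__s_CatchableType_const_,struct__s_ThrowInfo_const_>"));
    }
}
```

The second symbol quoted earlier (`?A0x49040904.??__F?_lock@...`) would return `null` here, because its remainder is itself a mangled name.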

ghizard commented 1 month ago

I have a quick solution that will address the OOM. The need for proper processing of the dotted symbols is a known issue and will not be addressed with this solution.