nim-lang / RFCs

Demangle Symbols in Debuggers (LLDB, GDB) #540

Open miguelmartin75 opened 1 year ago

miguelmartin75 commented 1 year ago

Summary

Related issue which is closed: https://github.com/nim-lang/Nim/issues/8596

Description

Here are my findings from researching LLDB; I have not researched GDB. I am posting them here in case others want to implement this, or can point out something I have missed in my proposed solution.

For LLDB, one needs to:

  1. (Required) Let LLDB know how to identify the mangling scheme & how to de-mangle a symbol
  2. (Optional) Implement a Language plugin for deeper LLDB integration

References:

From reading the LLDB source: the mangling scheme must be distinguishable from every other scheme, and code to demangle it must be provided. All mangling schemes used by other languages/compilers (C++/Itanium, C++/MSVC, D, Rust) use a prefix that identifies the scheme, and therefore the compiler, that produced the name.
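To make the prefix idea concrete, here is a minimal sketch of prefix-based classification. The enum and function names are my own and are not LLDB's actual declarations; only the well-known prefixes (`_Z` for Itanium, `?` for MSVC, `_R` for Rust v0, `_D` for D) are facts, and real classifiers handle more variants than this.

```cpp
#include <string>

// Illustrative names only; these are NOT LLDB's real declarations.
enum class Scheme { None, Itanium, MSVC, RustV0, D };

// Returns true if `name` begins with `prefix`.
inline bool hasPrefix(const std::string& name, const std::string& prefix) {
    return name.compare(0, prefix.size(), prefix) == 0;
}

// Classify a symbol purely by its leading characters.
inline Scheme classify(const std::string& name) {
    if (name.empty()) return Scheme::None;
    if (name[0] == '?') return Scheme::MSVC;          // MSVC C++ names start with '?'
    if (hasPrefix(name, "_Z")) return Scheme::Itanium; // Itanium C++ ABI
    if (hasPrefix(name, "_R")) return Scheme::RustV0;  // Rust v0 mangling
    if (hasPrefix(name, "_D")) return Scheme::D;       // D
    return Scheme::None;                               // plain C names end up here
}
```

A symbol emitted through Nim's C backend is a plain C identifier, so with a classifier like this it falls into `Scheme::None`, which is the problem the rest of this proposal tries to address.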

For Nim, identifying the mangling scheme/language from a mangled name is more complex, because Nim is compiled into a target language that already has its own mangling scheme. If we had control over the binary or the debug symbol output (e.g. DWARF), I believe this would be easier, but since the target language's compiler produces both, we do not.

To solve this with today's standard Nim compiler, here are my researched steps:

  1. Contain/embed a unique constant identifier within each symbol to identify that this symbol was output from the Nim compiler. Modifications to be done here: https://github.com/nim-lang/Nim/blob/502a4486aeb8d0a5dcdf86540522d3dc16960536/compiler/ccgutils.nim#L71
    • Unfortunately, such an identifier could collide with identifiers already used in existing C or C++ codebases. Using Unicode characters in it would make collisions rare, but would require C99 or later
      • This probably requires an RFC and further discussion
  2. Modify LLDB:
    1. Modify the Mangled class
      1. Add mangling scheme enum entry for Nim here: https://github.com/llvm/llvm-project/blob/main/lldb/include/lldb/Core/Mangled.h#L41-L48
      2. Classify if the symbol originates from the Nim compiler with the above knowledge: https://github.com/llvm/llvm-project/blob/main/lldb/source/Core/Mangled.cpp#L42-L79
        • Implementation seems to require one-level deep recursion
      3. Call & implement demangling code in C++
        • Getting this accepted to LLDB might be difficult (due to valid C/C++ identifiers). Perhaps a compiler option similar to Apple's LLDB (see here) or a run-time flag would be appropriate here (seems to require many modifications of LLDB, maybe LLVM folks know best here)
  3. (optional): implement a Language plugin. Why? Deeper integration with LLDB
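As a rough sketch of steps 1 and 2.1 above, the following shows how a classifier could detect a Nim marker and why one level of recursion comes up: with the C++ backend, the Nim name is itself wrapped by the C++ compiler's mangling. Everything here is an assumption for illustration. The marker `nimsym_` is hypothetical (no marker has been chosen and the current compiler emits none), the function names are my own, and `innerItaniumName` is a deliberately simplified stand-in for a real Itanium demangler that only understands the unscoped form `_Z<length><identifier>`.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// HYPOTHETICAL marker; the proposal does not fix a concrete value.
static const std::string kNimMarker = "nimsym_";

// If `name` starts with the marker, store the remainder in `out`.
bool stripNimMarker(const std::string& name, std::string& out) {
    if (name.compare(0, kNimMarker.size(), kNimMarker) != 0) return false;
    out = name.substr(kNimMarker.size());
    return true;
}

// Simplified stand-in for an Itanium demangler: extracts the identifier
// from `_Z<length><identifier>...` and rejects everything else.
bool innerItaniumName(const std::string& name, std::string& out) {
    if (name.compare(0, 2, "_Z") != 0) return false;
    std::size_t i = 2, len = 0;
    while (i < name.size() && std::isdigit(static_cast<unsigned char>(name[i]))) {
        len = len * 10 + static_cast<std::size_t>(name[i] - '0');
        if (len > name.size()) return false;  // length cannot fit; also avoids overflow
        ++i;
    }
    if (len == 0 || i + len > name.size()) return false;
    out = name.substr(i, len);
    return true;
}

// Try the symbol as-is (C backend), then look one level inside a
// C++-mangled name (C++ backend).
bool demangleNim(const std::string& symbol, std::string& out) {
    if (stripNimMarker(symbol, out)) return true;
    std::string inner;
    return innerItaniumName(symbol, inner) && stripNimMarker(inner, out);
}
```

With these assumptions, both `nimsym_foo` and `_Z10nimsym_foov` yield `foo`, while `_Z3foov` is left to the C++ demangler. A real implementation would also have to reverse whatever encoding the Nim compiler applies to the rest of the name, and would use a complete Itanium demangler rather than this simplified parser.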

Alternatives

Here are some alternatives I can think of, but will likely require more work:

  1. Modify the Nim compiler to output the target assembly directly (or via LLVM). This is related to NIR
    • It would likely be easier to convince the LLVM/LLDB team to merge name demangling changes for Nim if the scheme did not conflict with C/C++ symbols
  2. Write a debugger in Nim. Pros:
    • Would offer a chance to integrate with the compiler, i.e. to evaluate nimscript in the debugger or to modify the program at run-time / to provide a REPL similar to Swift
    • Reading & modifying the LLDB code is hard with all the OOP/abstraction

Examples

No response

Backwards Compatibility

My proposed solution changes the way the Nim compiler mangles names. For backward compatibility, one could offer a flag to mangle the old way, though I don't think this flag would be necessary: just re-compile your source if you want debugging support.

Links

Mangling & D:

LLDB codepointers:

Writing a debugger:

Zectbumo commented 12 months ago

+1 Please let's write our own debugger.

ire4ever1190 commented 11 months ago

Implementing this for GDB would be a similar process ^1. IMO adding support to existing debuggers is better than writing our own, since it means less maintenance and allows easy integration with existing tools.