terralang / terra

Terra is a low-level system programming language that is embedded in and meta-programmed by the Lua programming language.
terralang.org

Support for debug symbols in newer LLVM versions #354

Open elliottslaughter opened 5 years ago

elliottslaughter commented 5 years ago

For debug symbol support, you can run git log --graph --decorate --oneline master..develop and grep for debug. There was some work in progress by @zdevito that is probably worth pulling in, assuming it doesn't cause a lot of churn. However, if I'm reading the commit messages correctly, that only gets us up to LLVM 3.8. In general, I believe debug symbols are controlled by metadata on the LLVM IR, and that metadata seems to break more frequently than other aspects of LLVM. I don't think there is a general recipe for what to do except to run Clang, see what LLVM IR it emits, and try to reverse engineer that.

Originally posted by @elliottslaughter in https://github.com/zdevito/terra/issues/353#issuecomment-465330821

Is there any chance that including C++ header support in terra would be a simple change? Because if we can include the lldb public C++ api, we can get more use out of the debug information. I'll see if I can figure out getting the debug information to work. I don't have that much experience using LLVM directly, but I'll see what I can do.

Originally posted by @aiverson in https://github.com/zdevito/terra/issues/353#issuecomment-465337290

elliottslaughter commented 5 years ago

@aiverson What sort of C++ support do you need exactly? Are you "just" calling C++ functions that could conceptually be marked as extern "C" or do you need more than that?

C++ is a big language to wrap and the more you try to pull in the hairier it gets. I've seen a lot of C++ wrapper projects and so far none have been remotely pretty, so if the source is stable adding a C wrapper is likely the easier way to go.
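To make the C wrapper suggestion concrete, here is a minimal sketch of the usual opaque-handle pattern. Everything in it is hypothetical: demo::Frame is a stand-in for whatever C++ class one wants to reach, and the demo_frame_* names are invented for illustration; none of this is LLDB's or Terra's actual API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical C++ class standing in for the API we want to expose.
namespace demo {
class Frame {
public:
    explicit Frame(std::string name) : name_(std::move(name)) {}
    void push_local(std::int64_t v) { locals_.push_back(v); }
    std::size_t num_locals() const { return locals_.size(); }
    std::int64_t local(std::size_t i) const { return locals_.at(i); }
    const std::string &name() const { return name_; }
private:
    std::string name_;
    std::vector<std::int64_t> locals_;
};
}  // namespace demo

// Opaque handle: C callers only ever see a pointer to this struct.
struct demo_frame { demo::Frame impl; };

// Every wrapper has C linkage and catches exceptions so that none
// can propagate across the C boundary.
extern "C" demo_frame *demo_frame_create(const char *name) {
    if (!name) return nullptr;
    try { return new demo_frame{demo::Frame(name)}; }
    catch (...) { return nullptr; }
}

extern "C" void demo_frame_destroy(demo_frame *f) { delete f; }

// Returns 0 on success, -1 on failure.
extern "C" int demo_frame_push_local(demo_frame *f, std::int64_t v) {
    if (!f) return -1;
    try { f->impl.push_local(v); return 0; }
    catch (...) { return -1; }
}

extern "C" std::size_t demo_frame_num_locals(const demo_frame *f) {
    return f ? f->impl.num_locals() : 0;
}

// Writes local i to *out; returns 0 on success, -1 if out of range.
extern "C" int demo_frame_local(const demo_frame *f, std::size_t i,
                                std::int64_t *out) {
    if (!f || !out || i >= f->impl.num_locals()) return -1;
    *out = f->impl.local(i);
    return 0;
}

// The returned pointer stays valid only while the frame is alive.
extern "C" const char *demo_frame_name(const demo_frame *f) {
    return f ? f->impl.name().c_str() : nullptr;
}
```

With the declarations moved into a plain C header, the wrapper could presumably be pulled into Terra with terralib.includec like any other C library, with no C++ parsing needed on the Terra side.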

aiverson commented 5 years ago

LLDB has a public API in C++ with an explicit goal of being simple for automated consumption by wrapper generation tools. It is mostly data containers with a few accessor methods, and no fancy template stuff that I have seen. I don't know if I can count on the binding being stable. SWIG is used by the project to produce python bindings from the API headers.

elliottslaughter commented 5 years ago

I'm fine in theory with parsing C++; my main concerns are that it's a slippery slope and that it may be difficult to set appropriate expectations on what should work and what shouldn't. Certainly, if all you do is turn on C++ mode in Clang and handle the ASTs that you want, it doesn't seem like an unreasonable amount of complexity.

Having said that, I'm pretty sure all the LLVM calls are made from C++ so unless you're also planning to radically rearchitect Terra in the process it shouldn't be necessary to do so (if I understand how you're planning on using this).

aiverson commented 5 years ago

Yeah, it should be doable to make a simple wrapper of the important functionality and export it as part of the terra baselib. I'm not sure I want to make lldb a mandatory dependency of terra, since I'd like to keep the executable size down. Should it have a build flag to enable it, or should I just include it unconditionally, since executables built by terra wouldn't have it in them?
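For reference, the build-flag option could look roughly like the sketch below. The macro TERRA_ENABLE_LLDB and both function names are hypothetical, not existing Terra options; the point is only that the LLDB headers and library would be touched in one branch of the conditional and nowhere else.

```cpp
#include <cassert>
#include <string>

// TERRA_ENABLE_LLDB is a hypothetical macro that a build system would
// define (for example with -DTERRA_ENABLE_LLDB=1) when LLDB is wanted.
#if defined(TERRA_ENABLE_LLDB)
// Only this branch would include the LLDB headers and require linking
// against the LLDB library.
static const bool kHaveLLDB = true;
#else
// Default build: no LLDB headers, no link-time dependency.
static const bool kHaveLLDB = false;
#endif

// Hypothetical query so callers can check for the feature at run time.
bool debugger_support_available() { return kHaveLLDB; }

// Hypothetical human-readable status, e.g. for a version banner.
std::string debugger_support_status() {
    return kHaveLLDB ? "lldb support compiled in"
                     : "lldb support disabled at build time";
}
```

A default build compiles the second branch, so users who never install LLDB pay nothing for it.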

elliottslaughter commented 5 years ago

How do programs normally use the LLDB API? Presumably not everyone wants to make it a hard dependency. I don't think I even have it installed on my system; in Ubuntu at least it's a separate package and I've never bothered to install it.

If having C++ header support in Terra would make it easier to keep the dependency optional, that might be more motivation for such a feature.

OvermindDL1 commented 5 years ago

How do programs normally use the LLDB API?

The few I've seen generate their AST via libclang instead of LLVM straight... >.>

elliottslaughter commented 5 years ago

I'm still a little confused though. Does Clang have a hard dependency on LLDB? Or is there a liblldb that's somehow separate from the LLDB binary?

Let me put it this way: My standard LLVM build procedure is to go to the link below and download two tarballs, one for LLVM and one for Clang. I've never even downloaded the LLDB tarball let alone built it, so how can Clang have LLDB support?

https://releases.llvm.org/download.html#7.0.1

OvermindDL1 commented 5 years ago

As far as I understand, clang doesn't use lldb directly; it just decorates the AST nodes as appropriate, whereas lldb is for debugging into programs built from such properly 'decorated' LLVM IR.

/me has not touched lldb yet, only some minor work with libclang

elliottslaughter commented 5 years ago

Right, that sounds more like what I understood of how things work.

My inclination would be to follow a similar approach to Clang, and figure out what metadata to attach to the LLVM IR to get it to generate the right debug info. Or does the LLDB API make this somehow easier?

@aiverson Or did I somehow misunderstand what you intended to do with LLDB?

aiverson commented 5 years ago

We would still need to attach metadata to the LLVM IR to get the right debug info. If we have the LLDB API, then we get the ability to use the debug info. We would be able to read values off of the stack in the traceback to get extra information. In the terra state, we have access to all of the metadata about the types and convenient ways to manipulate them. Being able to debug a terra program with better introspection and ability to live-edit things seems useful. Writing tools to debug terra code in terra seems useful.

capr commented 5 years ago

I'm having a hard time understanding this issue. Can someone please answer a few questions:

  1. For what LLVM versions exactly does terra generate debug info? IOW when did this stop working?

  2. Is debug support for Windows completely unavailable? If yes, does anyone know what is needed to implement it? Is there something missing from LLVM, or is it just that LLVM doesn't have a cross-platform API to generate debug info, and so it must be generated by hand for each format (DWARF, GDB etc)?

I assume debug.traceback() must be ported since it uses ucontext.h...

elliottslaughter commented 5 years ago

  1. LLVM 3.5. (There is some support for LLVM 3.8 in the develop branch, but it would need to be pulled out and cleaned up.)

  2. I don't know, did you try with LLVM 3.5?

And this is all aside from debug.traceback() because if debug info worked you could at least get backtraces in gdb---right now even that doesn't work.

As far as I understand, LLVM has an internal format for debug info that it maps into the various formats. It's not that we need to support individual formats in Terra. It's that LLVM's break-the-world-in-every-release policy is even more true for debug info than it is for the rest of LLVM. It's also minimally (if at all) documented. So figuring out how to get debug info with Terra amounts to reading the Clang source and/or output and trying to reverse engineer what kind of metadata LLVM expects in a given release... and then you can expect it all to break in the next release again. (Thanks, LLVM!)
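As a rough aid for that reverse engineering, one can dump Clang's IR with something like clang -g -S -emit-llvm t.c -o t.ll and tally which debug metadata node kinds appear. The sketch below does the tallying on textual IR. The embedded sample is a hand-abbreviated, hypothetical excerpt in the style recent LLVM releases print (older releases such as 3.5 used a different, untyped metadata syntax), and count_debug_nodes is a name invented for illustration.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Count occurrences of each "!DIxxx(" metadata node kind in textual
// LLVM IR. This is plain string scanning, not a real IR parser.
std::map<std::string, int> count_debug_nodes(const std::string &ir) {
    std::map<std::string, int> counts;
    std::size_t pos = 0;
    while ((pos = ir.find("!DI", pos)) != std::string::npos) {
        std::size_t start = pos + 1;  // skip the leading '!'
        std::size_t end = start;
        while (end < ir.size() &&
               std::isalnum(static_cast<unsigned char>(ir[end]))) {
            ++end;
        }
        // Only count names that are immediately followed by '('.
        if (end < ir.size() && ir[end] == '(') {
            ++counts[ir.substr(start, end - start)];
        }
        pos = end;  // always advances past at least "!DI"
    }
    return counts;
}

// Hypothetical, abbreviated excerpt; real Clang output has more fields.
static const char kSampleIR[] =
    "define i32 @main() !dbg !7 {\n"
    "  ret i32 0, !dbg !10\n"
    "}\n"
    "!llvm.dbg.cu = !{!0}\n"
    "!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1)\n"
    "!1 = !DIFile(filename: \"t.c\", directory: \"/tmp\")\n"
    "!7 = distinct !DISubprogram(name: \"main\", file: !1, line: 1, unit: !0)\n"
    "!10 = !DILocation(line: 1, column: 14, scope: !7)\n";
```

Running the same tally over the IR Terra emits and over Clang's output for an equivalent C function would show at a glance which node kinds are missing, though not whether their fields are right.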

capr commented 5 years ago

I see, thanks for the info.

I didn't try it on Windows because I didn't need debugging up until now, so when I upgraded my terra build I went for LLVM 6, unaware of the debug situation. I also missed the part in the manual where it says it doesn't work on Windows.

aiverson commented 5 years ago

I'll see if I have time to get back to this any time soon. I started working on updating the debug info to work on newer versions of LLVM, but it wasn't going quickly.