StanfordLegion / legion

The Legion Parallel Programming System
https://legion.stanford.edu
Apache License 2.0
689 stars 144 forks source link

Realm: backtrace supports file names and line numbers #1791

Open eddy16112 opened 5 days ago

eddy16112 commented 5 days ago

We removed the libdw code and plan to bring back the feature of showing the file names and line numbers for backtraces using third party libraries.

There are two candidates:

I played with them, and both libraries should be able to satisfy our needs. They both support cmake. For the cmake system, we can let cmake fetch the library for us. For the makefile system, my plan is to automatically look for the headers/libraries in the default system directory, if not found, we will ask users to set the ROOT folder (just like what we did for CUDA).

@elliottslaughter @muraj @apryakhin @lightsighter Please let me know what is your preference.

lightsighter commented 5 days ago

I think if we're going to have a dependency, we should pull the source code into our repo to version lock it and attribute its license. I don't like doing this because it's messy, but I prefer that over having our code break because somebody changes something in a remote repo (either the code itself or the remote repo's build system).

This should also be mostly irrelevant once c++23 comes around. We should already start building that path since it looks like there's already early support for <stacktrace> in some compilers: https://en.cppreference.com/w/cpp/compiler_support/23

I'm also going to ask this again: is it really so hard to roll our own backtrace functionality on top of libunwind? Why do we need a custom backtrace library?

eddy16112 commented 5 days ago

I think if we're going to have a dependency, we should pull the source code into our repo to version lock it and attribute its license.

We do not need fetch the source code into our repo. CMake is able to fetch a specific commit. We can manually implement it for Makefile if necessary.

I'm also going to ask this again: is it really so hard to roll our own backtrace functionality on top of libunwind? Why do we need a custom backtrace library

No, it is not hard. If you look at this PR https://gitlab.com/StanfordLegion/legion/-/merge_requests/1542/diffs, the die_has_pc and find_fundie_by_pc were copied from backward-cpp. I can tweak them to make them "look different" from the original code, but TBH, I do not know how to determine the similarity of two piece of code, however, the methodology might be the same as backward-cpp.

lightsighter commented 5 days ago

We do not need fetch the source code into our repo. CMake is able to fetch a specific commit. We can manually implement it for Makefile if necessary.

And what happens when that commit or release from that remote repo is deleted?

I can tweak them to make them "look different" from the original code, but TBH, I do not know how to determine the similarity of two piece of code, however, the methodology might be the same as backward-cpp.

You're already contaminated by looking at the code. I think someone else (maybe me) would need to write it from scratch without looking at the implementation of backward-cpp and just using the documentation for libunwind. I'm willing to do this if it means we can avoid introducing a dependency.

eddy16112 commented 5 days ago

And what happens when that commit or release from that remote repo is deleted?

We can lock to a release version.

I'm willing to do this if it means we can avoid introducing a dependency.

We were using https://github.com/roolebo/elfutils/tree/master to look up debug symbols. Very limited documentation. I just realized another copyright issue. The elfutils is under GPL 3.0, if we use it, I am not sure if we can license Realm under Apache. If We want to avoid the elfutils, an alternative is libdwarf, but it is much more complicated. Therefore, I would prefer to use either backward-cpp or cpptrace with the libdwarf backend.

muraj commented 4 days ago

I'm also going to ask this again: is it really so hard to roll our own backtrace functionality on top of libunwind? Why do we need a custom backtrace library?

Keep in mind that libunwind doesn't support windows, as it's for ELF programs. If we write our own, we're going to basically duplicate what these other libraries do already except we have to maintain it ourselves for various platforms. We have enough work to do with all the other things realm does, there doesn't seem to be a real reason backtrace support needs to be one of them, especially given the complexity of licensing in this area.

This should also be mostly irrelevant once c++23 comes around.

Sure, and when do you think HPC systems will adopt this ubiquitously? Maybe when I retire? We require c++17, I wouldn't feel comfortable requiring anything higher at this time.

I am not sure if we can license Realm under Apache.

We can license our software however we want, but use of GPL requires open source, which can be a big problem for folks that want to maintain a closed source application above us. We will need to remove elfutils as well, so yes, we need to replace it with libdwarf.

And what happens when that commit or release from that remote repo is deleted?

I don't think this is a problem we need to worry about, the libraries we're talking about are pretty popular, and we would target a specific tagged commit (a common tactic in cmake-based systems that is pretty robust in most external library uses). If an issue does arise, then we can just fix it on our end. I would rather not check in other people's code into our own repo, as that makes it difficult to update external dependencies when we need to. The other alternative is to use git submodules, which I'm all for, but I know people have been very much against the idea.

Regardless, I'd be down for using backwards-cpp as a header-only library. It should help make integrating into Makefile and CMake easier.

eddy16112 commented 4 days ago

@muraj I am afraid we will have to use cpptrace. The backward-cpp (libdawrf backend) needs libdwarf from https://github.com/davea42/libdwarf-code/tree/main, and libelf. There are several libelf, the one without GPL is https://wiki.freebsd.org/LibElf. Even though backward-cpp is header only, we still need to compile these two dependencies. I think that's why backward-cpp prefers to use the elfutils backend.

The cpptrace needs libdwarf, but not libelf. It includes libdwarf in its CMake file, so when compiling cpptrace, libdwarf will be downloaded and compiled together, which makes our life much easier. The libdwarf is under LGPL, so close source is fine as long as we dynamic link against it.