crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0
19.33k stars 1.61k forks source link

Improve Exception backtrace API #10681

Open Blacksmoke16 opened 3 years ago

Blacksmoke16 commented 3 years ago

Currently an exception's backtrace is an Array(String). This is usually fine, however there are some use cases where you may want to present the trace in a different way; such as pretty printing an exception. Currently this is not very easy to do as each portion of the trace needs to be manually parsed either manually or via an external shard.

I propose that the backtrace is also exposed an a like Array(Frame), where each Frame would have getters for the related info of that line. E.g. filename, method name, line/column etc. This would provide a much cleaner API to work with.

Given Crystal handles printing each line I would assume it inherently has access to each portion of data. This could be kept backwards compatible by exposing the frames as another method that the current one could leverage. E.g. @frames.map &.to_s.

straight-shoota commented 3 years ago

It looks Rust is currently (although for quite some time already) refactoring their backtrace API with a BacktraceFrame type (https://github.com/rust-lang/rust/issues/79676, https://doc.rust-lang.org/std/backtrace/index.html, https://github.com/rust-lang/rust/issues/53487). That could certainly serve as inspiration.

Blacksmoke16 commented 3 years ago

I think it's worth pointing out the two PRs open about this are mutually exclusive. We should probably figure out which one we want to move forward with and go from there, versus needing to review both all the time...

asterite commented 3 years ago

True. Also probably discuss outside of here what to expose. What's the reason to expose the address and the cryptic function name?

Blacksmoke16 commented 3 years ago

Is there reason to not expose the address? I guess my thinking was even if most people won't ever need it, still might be helpful to expose; especially if we're going for a 1:1 mapping between the backtrace string and a Frame instance.

Regarding the function name, I combined both names into 1 function ivar. Mainly because currently there isn't a case where you'd have both (https://github.com/crystal-lang/crystal/blob/master/src/exception/call_stack.cr#L199-L201) in a given backtrace string. So technically it is still exposed, but the API is the same.

asterite commented 3 years ago

Let's say someone implements an interpreter for Crystal. What would be the address?

asterite commented 3 years ago

Ah, nevermind, in any case it won't happen.

straight-shoota commented 3 years ago

Let's please focus the discussion on the proposed API here in this issue, instead of arguing over details in the competing PRs.

straight-shoota commented 3 years ago

I think we should start with ironing out the data format for a frame.

I've looked at Rust's implementation (https://doc.rust-lang.org/src/std/backtrace.rs.html#152-155) and they actually separate frame and symbols and multiple symbols can be associated with one frame. I don't know when you would have multiple symbols per frame. But there must be a reason for that.

The data structures (converted to the Crystal equivalent) look like this:

record BacktraceFrame,
  frame : RawFrame,
  symbols : Array(BacktraceSymbol)

record BacktraceSymbol,
  name : String?,
  filename : String?,
  line_number : Int32?,
  column_number : Int32?

The process is also split into two parts, first retrieving the raw frames from whatever backend is used to walk the stack. The result is a collection of BacktraceFrame with empty symbols. The second step resolves the frames against the debug information and fills the symbols.

rdp commented 2 years ago

Rust exposes the ip addresses. No idea who would use them though, but I guess since it's a struct people are free to ignore it. Crystal even has an "original_ip" but again not sure if anyone cares but might not hurt. The reason Rust does multiple symbols per frame is Normally there is only one symbol per frame, but sometimes if a number of functions are inlined into one frame then multiple symbols will be returned. The first symbol listed is the "innermost function", whereas the last symbol is the outermost (last caller). https://docs.rs/backtrace/0.3.2/backtrace/struct.BacktraceFrame.html Also, apparently there are two ways to lookup the function name. The dwarf has just the name and dlAddr has the name with "parameter types" which I think might be also useful...FWIW.

Fryguy commented 2 years ago

Not to derail the thread, but I just noticed @asterite's foreshadowing 😂 ☝️ https://github.com/crystal-lang/crystal/issues/10681#issuecomment-835854880

asterite commented 2 years ago

It's fine, we can leave the ip address as zero for interpreted mode. In the end backtraces in the interpreter are done in a completely different way, so it makes sense that the values there will also be slightly different.

asterite commented 2 years ago

(I also wrote that comment while I was writing the interpreter, so I knew what was coming :-P)

Blacksmoke16 commented 2 years ago

Getting back to this issue again and got the data model proposed in https://github.com/crystal-lang/crystal/issues/10681#issuecomment-845129773 working, but still have a few questions:

Do we still think its necessary to support having multiple symbols per frame?

Based on https://github.com/crystal-lang/crystal/issues/10681#issuecomment-1016103785, I'm just not sure if this is a rust specific compiler thing, or if it's also applicable to us given it also uses LLVM. Prob worth investigating/finding a scenario where there would be more than one to test with/how to extract that data.

What to do with CRYSTAL_CALLSTACK_FULL_INFO ENV var?

Currently i'm still doing essentially what it was before by using sname as function and also having it set the ip address on each symbol if its set.

Do we actually want to still expose the address?

I'm assuming yes based on the past comments.

What do we ultimately want the API for this to be like?

Do we want to actually expose the BacktraceFrame type, versus like yielding each of its symbols? If not, then Exception#frames and Exception#each_frame that returns an array of/yields each BacktraceSymbol would comes to mind first. Tho the naming of that is a bit weird given its called "frames" but you get a "symbol"? So could just as easily go with #symbols and #each_symbol if that'd work better.

The second step resolves the frames against the debug information and fills the symbols.

Do we care at all this would be a mutable struct no?

rdp commented 1 year ago

My wont would be to have lots of info returned, people can choose not to use it if they care. Maybe nodoc the ip address, can't imagine someone needing it externally? :)

On Mon, Sep 5, 2022 at 8:58 PM George Dietrich @.***> wrote:

Getting back to this issue again and got the data model proposed in #10681 (comment) https://github.com/crystal-lang/crystal/issues/10681#issuecomment-845129773 working, but still have a few questions:

Do we still think its necessary to support having multiple symbols per frame?

Based on #10681 (comment) https://github.com/crystal-lang/crystal/issues/10681#issuecomment-1016103785, I'm just not sure if this is a rust specific compiler thing, or if it's also applicable to us given it also uses LLVM. Prob worth investigating/finding a scenario where there would be more than one to test with/how to extract that data.

What to do with CRYSTAL_CALLSTACK_FULL_INFO ENV var?

Currently i'm still doing essentially what it was before by using sname as function and also having it set the ip address on each symbol if its set.

Do we actually want to still expose the address?

I'm assuming yes based on the past comments.

What do we ultimately want the API for this to be like?

Do we want to actually expose the BacktraceFrame type, versus like yielding each of its symbols? If not, then Exception#frames and Exception#each_frame that returns an array of/yields each BacktraceSymbol would comes to mind first. Tho the naming of that is a bit weird given its called "frames" but you get a "symbol"?

— Reply to this email directly, view it on GitHub https://github.com/crystal-lang/crystal/issues/10681#issuecomment-1237604525, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAADBUGDDBKXQ4N2ZH47XTTV42XNHANCNFSM44GCZ55Q . You are receiving this because you commented.Message ID: @.***>