rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
97.34k stars 12.59k forks source link

Tracking Issue for `backtrace_frames` #79676

Open seanchen1991 opened 3 years ago

seanchen1991 commented 3 years ago

This is a tracking issue for one of the features mentioned in #81022, the ability to create an iterator over backtrace frames. The feature gate for the issue is #![feature(backtrace_frames)].

About tracking issues

Tracking issues are used to record the overall progress of implementation. They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions. A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature. Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Unresolved Questions

Implementation history

seanchen1991 commented 3 years ago

@rustbot modify labels +T-libs

rustbot commented 3 years ago

Error: Label B-unstable can only be set by Rust team members

Please let @rust-lang/release know if you're having trouble with this bot.

petrochenkov commented 3 years ago

Apologies if I'm late to the party, or if all of this was discussed before, but I'd suggest studying the C++ stacktrace proposal (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0881r7.html) before stabilizing things like this.

The interface is based on Boost.Stacktrace and was heavily scrutinized and changed in non-obvious ways to accommodate for some backtrace operations being expensive or platform-dependent.

seanchen1991 commented 3 years ago

@petrochenkov Looking through the linked document, I don't really see anything that is within scope of this PR. Please let me know if there's any specific points in the doc that should be brought up though; I certainly might have missed something.

KodrAus commented 3 years ago

The Backtrace::frames method currently returns a &'a [BacktraceFrame], which gives us slicing, indexing, and iteration for free. We might want to consider a BacktraceFrames<'a> wrapper though that gives us more wiggle room for lazy resolution. It'll just mean adding more types and boilerplate since we'll need to implement traits for all the iteration and slicing ourselves.

iddm commented 1 year ago

I have just looked at the Backtrace stabilisation announcement in 1.65. However, I wonder, what's the use of it if we can't traverse the frames? I believe the primary use of a stack trace like that would be the possibility to walk over the frames and be able to analyse them in any manner rather than just printing. For example, I'd think of having something like this: https://docs.oracle.com/javase/7/docs/api/java/lang/StackTraceElement.html How come the "Backtrace" is said to be stabilised while the most crucial and juicy part of it isn't?

John-Nagle commented 1 year ago

I see this issue has been in progress for over two years now.

Here's a use case. I have a GUI program, a metaverse client. One of those things for end users who are not programmers. There's no text console. There are multiple threads. If a thread panics, I need to capture a backtrace, log it, tell the user something via a GUI dialog, and shut down the program.

The only thing you can do with a Backtrace right now is format it. You can't iterate over it in production code. Can't even clone it. If you format it, 80% of the output is the panic and backtrace system doing its thing..

I'd like to have both the ability to iterate over a backtrace, and something that extracts only the frames relevant to the cause of the panic. In the playground linked here, the only two relevant lines, out of 30, are:

7: playground::called_by_worker
             at ./[src/main.rs:5](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=694b6273d3b16c092591de206eefaa17#):5
8: playground::worker
             at ./[src/main.rs:9](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=694b6273d3b16c092591de206eefaa17#):5

Everything else is the panic and backtrace system's own machinery. And the included URLs are useless unless you're in a development environment. None of this stuff is oriented towards reporting problems detected in the field.

Could be worse. What you get from catch_unwind is an Any that can't even be formatted.

Thomasdezeeuw commented 1 year ago

The only thing you can do with a Backtrace right now is format it. You can't iterate over it in production code. Can't even clone it. If you format it, 80% of the output is the panic and backtrace system doing its thing..

I've been thinking about reducing the backtrace length as well. It would be great if we could use something like std::panic::Location to trim backtraces to the starting point we're interested in, skipping over the frames for the function(s) to handle and capture the backtrace.

Could be worse. What you get from catch_unwind is an Any that can't even be formatted.

You can actually convert the value into a String or &str, see https://github.com/Thomasdezeeuw/heph/blob/8725fe9ed871f135b7ab87b5969eb1213121a242/src/actor/future.rs#L196-L204. Although it's not guaranteed to catch everything I think it handles most of it (at least all panics I've seen in years of Rust programming)

John-Nagle commented 1 year ago

Yes, reducing the backtrace length to the relevant content would help considerably.

Use case: for emergency dialogs, there's rfd::MessageDialog. This will usually work even if the main GUI has panicked. That dialog box doesn't scroll, so the text length has to be limited to 10-20 lines at most.

John-Nagle commented 1 year ago

Illustration of why excessively verbose backtraces with no reliable way to reduce them are a problem. Note that the dialog box has been forced off the screen. panicdialogfail We have a problem. The "OK" button is now off-screen.

So, there's now an ugly workaround: apply suitable not very portable screen-scraping regular expression to backtrace output to extract frames:

/// Split backtrace into frames, remove frames from Rust internals.
/// Otherwise, backtraces are mostly the panic system's internals and won't fit.
/// This is very dependent on the format of backtrace output.
/// It looks for the number: at the beginning of each frame.
///
/// This is a workaround for https://github.com/rust-lang/rust/issues/79676
fn trim_backtrace(s: &str) -> Vec<&str> {
    let splitter = Regex::new(r"(^|\n)\s+\d+:").unwrap();
    splitter.split(s).filter(|frame| !frame.contains("/rustc/") && !frame.is_empty()).collect()   // split frames
}

trimmedbacktrace Successful application of big-hammer workaround. The user sees the message, clicks OK, and the program exits cleanly.

fraillt commented 1 year ago

I hope I'm not too late to the party, and I want to express few issues that I have with current (nightly) state. So there are some facts, that I recently found out:

I have created PR where I describe my problem better, and propose very simple solution, but for std;;backtrace I think we need better design.

I think we need to address few problems:

With all that said, here are some thoughts:

John-Nagle commented 1 year ago

resolving frames can be REALLY SLOW (i'm talking about 0.5s to resolve single frame!!!)

Yes, although that's not intolerable for handing an error just before final exit.

random access to frames

Please don't over-complicate this or spend years bikeshedding. This is part of an emergency abort, not an introspection system. GUI programs, programs which automatically submit crash reports, and programs with no text console need this to report panics. Thanks.

iddm commented 1 year ago

resolving frames can be REALLY SLOW (i'm talking about 0.5s to resolve single frame!!!)

Yes, although that's not intolerable for handing an error just before final exit.

random access to frames

Please don't over-complicate this or spend years bikeshedding. This is part of an emergency abort, not an introspection system. GUI programs, programs which automatically submit crash reports, and programs with no text console need this to report panics. Thanks.

I wouldn't assume how and why it would be used. As long as something exists, one may never guess how one want to use it. For example, there was an "error-chain" crate that also used the stack traces for each "Error" it created. If it were to be called and printed (let's say, logged) within a loop, it would be a disaster.

Not to mention that I fail to see any reason not to look for a possible speed-up for the traversal of the entries and their resolution. If it can be speed up with some effort, it should always be done, it should be proven otherwise - why not to do it.

fraillt commented 1 year ago

I don't want to bikeshed as well, I just wanted to say that Backtrace::frames is propably good as is, but I don't mind if it changes to iterator as well. The main problem that I have is that being general-purpose language that emphasizes on performance, shouldn't simply assume that it is ok, to print something to string for 1s. Part of being general-purpose also mean that there might be many usecases. My case, as I described in PR was that I looked only at few frames, and performance increase was from 700+ms down to 25ms, and this is observable by our users.