rust-lang / rust

Empowering everyone to build reliable and efficient software.
https://www.rust-lang.org
Other
95.22k stars 12.28k forks source link

Tracking issue for `?` operator and `try` blocks (RFC 243, `question_mark` & `try_blocks` features) #31436

Open nikomatsakis opened 8 years ago

nikomatsakis commented 8 years ago

Tracking issue for rust-lang/rfcs#243 and rust-lang/rfcs#1859.

Implementation concerns:

arielb1 commented 8 years ago

@eddyb

Won't that lead to binary-size bloat?

eddyb commented 8 years ago

@arielb1 You'd do it in debug mode only where the debuginfo bloats the binary anyway - you could also reuse the debuginfo cleverly shrug.

mitsuhiko commented 8 years ago

@eddyb what is a Location in that case? Not sure what's hard about reading the IP is. Sure, you need custom JS for each target but that's not really all that hard.

eddyb commented 8 years ago

@mitsuhiko it could be the same info we use for overflow checks and other compiler-emitted panics. See core::panicking::panic.

est31 commented 8 years ago

Why pack so much stack information to an Error/Result while the stack is unwound? Why not just run the code while the stack is still there? E.g. what if you are interested in variables on the path of the stack? Just enable people to run custom code when an Err gets invoked, e.g. to call up a debugger of their choice. This is what try! already provides due to being an (overrideable) macro. The easiest way is to panic on an Err case which prints the stack provided one has started the program with the right params.

With try you can do anything you want on an Err case, and you can override the macro crate wide, or very scoped to not touch performance critical code e.g. if the error is hard to reproduce and you need to run alot of the perf critical code.

Nobody would need to preserve a fraction of the information in an artificial stack one builds up because the real one is going to be destroyed. All the macro overriding method could be improved with is:

eddyb commented 8 years ago

@est31 The stack is not automatically "unwound" like in a panic, this is syntactical sugar for early returns. And yes, you could have the compiler insert calls of some empty functions with fixed names that you can breakpoint on, this is pretty easy (and also have information that lets you do, e.g. call dump(data) - where dump and data are arguments debuggers can see).

hexsel commented 8 years ago

@eddyb I think empty functions would not allow a use-case of, e.g. keeping a few "canary" debug instances on a large deployment to see if errors appear in logs, to then go back and fix things. I understand it is preferrable to be proactive than reactive, but not everything is easy to predict

eddyb commented 8 years ago

@hexsel Yeah, which is why I prefer the TLS-based method where Result::unwrap or some new ignore method (or even always when dropping the error?) dumps the trace to stderr.

est31 commented 8 years ago

@eddyb If you add the instruction pointer or something derived from the value to a stack like data structure in TLS then you basically rebuild your small version of the real life stack. A return reduces the stack by one entry. So if you do that while you return you unwind part of the stack, while building a limited version of it at some other place in RAM. Perhaps "unwind" is the wrong term for behaviour resulting from "legal" returns, but if all code does ? or try! and at the end you intercept the end result is the same. Its great that rust does not make error propagation automatic, I really liked how java had required all exceptions to be listed after the throws keyword, rust is a good improvement on that.

@hexsel an overriding (as it already exists now for try!) based approach would allow for this -- you can run any code you wish, and log to any logging system. You would need some detection for "dupes" when multiple try's catch the same error as it propagates up the stack though.

eddyb commented 8 years ago

@est31 Overriding try! only works in your own code (it's literally macro import shadowing), it would have to be something different, like our "weak lang items".

@est31 That's not really correct (about unwinding), the traces and the real stack don't necessarily have any relationship, because moving Results around doesn't have to go up in the original backtrace, it can go sideways too. Also, if the binary is optimized, most of the backtrace is gone, and if you don't have debug info, then Location is strictly superior. You could even be running an obfuscating plugin which replaces all the source information with random hashes that can be matched by the developers of some closed source product.

Debuggers are useful (and as an aside, I'm loving lldb's cleaner backtrace output), but they're not a panacea, and we already output some information on panics so you can get some clues as to what's going on.

About that - I had some thoughts about a typesystem-level trick where {Option,Result}::unwrap would have an extra type argument, defaulting to a type dependent on the location the function was called from, such that the panics from those would have way more useful location information.

With progress on value parametrization, that might still be an option, but definitely not the only one, and I don't want to outright dismiss Result traces, instead I'm trying to find a model which is implementable.

mitsuhiko commented 8 years ago

Overriding try! is not a solution at all because it's contained within your own crate. That's unacceptable as debugging experience. I already tried plenty of that trying to deal with the current try!. Especially if you have many crates involved and errors get transmuted multiple times on the way through the stack it's nearly impossible to figure out where the error originates and why. If a bandaid like that is the solution then we will have to revisit error handling in general for large Rust projects.

@eddyb so is your suggestion that we literally embed the filename, function-name, line number and column number to that vector? That seems hugely wasteful particularly because that information is already contained in much more processable format in DWARF. Also DWARF lets us use the same process reasonably cheaply in production whereas this sort of location info appears to be so wasteful that nobody would ever run a release binary with it.

eddyb commented 8 years ago

How would it be significantly more wasteful than DWARF information? Filenames would be deduplicated, and on x64 the whole thing is the size of 3 pointers.

hexsel commented 8 years ago

@mitsuhiko so basically, you agree with the general direction, but not with the technical specifics of it?

How easy is it to expose DWARF information to a general Rust API?

mitsuhiko commented 8 years ago

@eddyb because DWARF information is not contained within the release binary but separate files. So I can have the debug files on symbols servers and ship a stripped binary to users.

eddyb commented 8 years ago

@mitsuhiko Oh, that's an entirely different approach than what I was assuming. Rust doesn't currently support that atm, AFAIK, but I agree it should.

Do you think instruction pointers are actually that useful compared to random identifiers generated by your build system for the purposes of debugging release binaries?

My experience has been that with any inlining debuggers have a hard time recovering much of the stack trace, except for explicit self/mutual recursion and very large functions.

est31 commented 8 years ago

Yes, try! is contained within your crate. If you pass a function pointer or similar to library code, and there is an error in your function, the try approach will still work. If a crate you use has an internal error or bug, you might need its source code in order to debug already, if the returned Err's information does not help. Stack traces are only helpful if you have access to the code, so closed source libraries (whose code you can't modify) will go have to go over support one way or another.

What about simply enabling both approaches, and let the developer decide what's best for them? I do not deny that the TLS based approach doesn't have any advantages.

The try model is implementable very easily, just desugar ? to try, only extra language extensions are needed if catch comes in (you would have added that behaviour inside the hardcoded behaviour of ? anyway)

mitsuhiko commented 8 years ago

@eddyb "Do you think instruction pointers are actually that useful compared to random identifiers generated by your build system for the purposes of debugging release binaries?"

That's how debugging native binaries in general works. We (Sentry) are using that almost entirely for the iOS Support at this point. We get a crash dump and resolve the addresses with llvm-symbolizer to the actual symbols.

eddyb commented 8 years ago

@mitsuhiko Since we already embed libbacktrace, we could use that to resolve code pointers to source locations, so I'm not entirely opposed to it.

mitsuhiko commented 8 years ago

@eddyb yeah. Just looked at the panic code. Would be nice if that was actually an API that Rust code could use. I can see this being useful in a few more places.

glaebhoerl commented 8 years ago

About that - I had some thoughts about a typesystem-level trick where {Option,Result}::unwrap would have an extra type argument, defaulting to a type dependent on the location the function was called from, such that the panics from those would have way more useful location information.

Speaking of which...

eddyb commented 8 years ago

@glaebhoerl Hah, maybe that's actually worth pursuing then? At least as some unstable experiment.

glaebhoerl commented 8 years ago

@eddyb No idea. Probably worth discussing it with some GHCers involved with it first, I only heard about this in passing and googled a link just now. And Rust doesn't have actual implicit parameters in the way that GHC does; whether default type parameters could work instead is an interesting question (probably one you've given more thought to than I).

DanielKeep commented 8 years ago

Just a thought: it'd be useful to have a switch that causes rustc to special-case constructing an Err such that it calls a fn(TypeId, *mut ()) function with the payload before returning. That should be enough to start with basic stuff like filtering errors based on the payload, trapping into a debugger if it sees an error of interest, or capturing backtraces for certain kinds of errors.

nrc commented 8 years ago

PR 33389 adds experimental support for the Carrier trait. Since it wasn't part of the original RFC, it should get a particularly close period of examination and discussion before we move to FCP (which should probably be separate to the FCP for the rest of the ? operator). See this discuss thread for more details.

tomaka commented 8 years ago

I'm opposed to extending ? to Options.

The wording of the RFC is pretty clear about the fact that the ? operator is about propagating errors/"exceptions". Using Option to report an error is the wrong tool. Returning None is part of the normal workflow of a successful program, while returning Err always indicates an error.

If we ever want to improve some areas of errors handling (for example by adding stack traces), implementing ? on Option means that we will have to exclude ? from the changes.

nikomatsakis commented 8 years ago

@tomaka could we keep the discussion on the discuss thread? (I've summarized your objection already) I personally find that long discussions on GH get pretty unwieldy, and it'd also be nice to be able to separate discussion this particular point from other future points or questions that may arise.

glaebhoerl commented 8 years ago

@eddyb Here's the released-version docs of the implicit callstacks GHC feature I mentioned earlier.

brson commented 8 years ago

No updates here in a while. Is anybody working to move this forward? Are the remaining tasks in the OP still accurate? Can anything here be E-help-wanted?

mitsuhiko commented 8 years ago

I played around last weekend if it would be possible to write a Sentry client for Rust that makes any sense. Since most error handling is actually based on results now instead of panics I noticed that the usefulness of this is severely limited to the point where I just decided to abandon this entirely.

I went to the crates.io codebase as an example to try to integrate an error reporting system into it. This brought me back to this RFC because I really think that unless we can record the instruction pointer somehow as results are passed down and converted in the different stacktraces it will be impossible to get proper error reporting. I already see this being a massive pain just debugging local complex logic failures where I nowadays resort to adding panics where I think the error is coming from.

Unfortunately I currently do not see how the IP could be recorded without massive changes to how results work. Has anyone else played around with this idea before?

nikomatsakis commented 8 years ago

We were discussing this in the @rust-lang/lang meeting. Some things that came up:

First, there is definite interest in seeing ? stabilizing as quickly as possible. However, I think most of us would also like to see ? operating on Option and not just Result (but not, I think, bool, as has also been proposed). One concern about stabilizing is that if we stabilize without offering any kind of trait that would permit types other than Result to be used, then it is not backwards compatible to add it later.

For example, I myself write code like this with some regularity:

let v: Vec<_> = try!(foo.iter().map(|x| x.to_result()).collect());

where I am relying on try! to inform type inference that I expect collect to return a Result<Vec<_>, _>. If were to use ?, this inference might fail in the future.

However, in previous discussions we also decided that an amendment RFC was needed to discuss the finer points of any sort of "carrier" trait. Clearly this RFC should be written ASAP, but we'd prefer not to block progress on ? on that discussion.

One thought we had was that if we took @nrc's implementation and we made the trait unstable and implemented it only for Result and some private dummy type, that ought to suppress inference while still only making ? usable with Result.

One final point: I think most of us would prefer that if you use ? on an Option value, it requires that your function's return type must also be Option (not e.g. Result<T,()>). It's interesting to note that this will help with the inference limitations, since we can ultimately infer from the declared return type in many cases what type is required.

The reason for not wanting inter-conversion is that it seems likely to lead to loose logic, sort of analogous to how C permits if x even when x has integral type. That is, if Option<T> denotes a value where None is part of the domain of that value (as it should), and Result<> represents (typically) the failure of a function to succeed, then assuming that None means the function should error out seems suspect (and like a kind of arbitrary convention). But that discussion can wait for the RFC, I suppose.

rkjnsn commented 8 years ago

The reason for not wanting inter-conversion is that it seems likely to lead to loose logic, sort of analogous to how C permits if x even when x has integral type. That is, if Option<T> denotes a value where None is part of the domain of that value (as it should), and Result<> represents (typically) the failure of a function to succeed, then assuming that None means the function should error out seems suspect (and like a kind of arbitrary convention). But that discussion can wait for the RFC, I suppose.

I heartily agree with this.

glaebhoerl commented 7 years ago

Another question we had agreed to gate stabilization on was nailing down the contracts that impls of the From trait should obey (or whatever trait we wind up using for Err-upcasting).

nikomatsakis commented 7 years ago

@glaebhoerl

Another question we had agreed to gate stabilization on was nailing down the contracts that impls of the From trait should obey (or whatever trait we wind up using for Err-upcasting).

Indeed. Can you refresh my memory and start out with some examples of things you would think should be forbidden? Or perhaps just the laws you have in mind? I have to admit that I am wary of "laws" like this. For one thing, they have a tendency to get ignored in practice -- people take advantage of the actual behavior when it suits their purposes, even if it goes beyond the intended limitations. So that leads to one other question: if we had laws, would we use them for anything? Optimizations? (Seems unlikely to me, though.)

alexbool commented 7 years ago

By the way, what is the state of catch expressions? Are they implemeted?

nikomatsakis commented 7 years ago

Sadly no :(

On Tue, Jul 26, 2016 at 06:41:44AM -0700, Alexander Bulaev wrote:

By the way, what is the state of catch expressions? Are they implemeted?


You are receiving this because you authored the thread. Reply to this email directly or view it on GitHub: https://github.com/rust-lang/rust/issues/31436#issuecomment-235270663

alexbool commented 7 years ago

Perhaps you should plan implementing it? There is nothing more depressing than accepted but not implemented RFCs...

nrc commented 7 years ago

cc #35056

Stebalien commented 7 years ago

FYI, https://github.com/rust-lang/rfcs/pull/1450 (types for enum variants) would open up some interesting ways implement Carrier. For example, something like:

trait Carrier {
    type Success: CarrierSuccess;
    type Error: CarrierError;
}

trait CarrierSuccess {
    type Value;
    fn into_value(self) -> Self::Value;
}

// (could really use some HKT...)
trait CarrierError<Equivalent: CarrierError> {
    fn convert_error(self) -> Equivalent;
}

impl<T, E> Carrier for Result<T, E>
{
    type Success = Result::Ok<T, E>;
    type Error = Result::Err<T, E>;
}

impl<T, E> CarrierSuccess for Result::Ok<T, E> {
    type Value = T;
    fn into_value(self) -> Self::Value {
        self.0
    }
}

impl<T, E1, E2> CarrierError<Result::Err<T, E2>> for Result::Err<T, E1>
    where E2: From<E1>,
{
    fn convert_error(self) -> Result::Err<T, E2> {
        Err(self.into())
    }
}

impl<T> Carrier for Option<T>
{
    type Success = Option::Some<T>;
    type Error = None;
}

impl<T> CarrierSuccess for Option::Some<T> {
    type Value = T;
    fn into_value(self) -> Self::Value {
        self.0
    }
}

impl<T> CarrierError<Option::None> for Option::None {
    fn convert_error(self) -> Option::None {
        self
    }
}

fn main() {
    let value = match might_be_err() {
        ok @ Carrier::Success => ok.into_value(),
        err @ Carrier::Error => return err.convert_error(),
    }
}
nikomatsakis commented 7 years ago

I just wanted to cross-post some thoughts from https://github.com/rust-lang/rust/pull/35056#issuecomment-240129923. That PR introduces a Carrier trait with dummy type. The intention was to safeguard -- in particular, we wanted to stabilize ? without having to stabilize its interaction with type inference. This combination of trait plus dummy type seemed like it would be safely conservative.

The idea was (I think) that we would then write-up an RFC discussing Carrier and try to modify the design to match, stabilizing only when we were happy with the overall shape (or possibly removing Carrier altogether, if we can't reach a design we like).

Now, speaking a bit more speculatively, I anticipate that, if we do adopt a Carrier trait, we would want to disallow interconversion between carriers (whereas this trait is basically a way to convert to/from Result). So intuitively if you apply ? to an Option, that's ok if the fn returns Option; and if you apply ? to a Result<T,E>, that's ok if the fn returns Result<U,F> where E: Into<F>; but if you apply ? to an Option and the fn returns Result<..>, that's not ok.

That said, this sort of rule is hard to express in today's type system. The most obvious starting point would be something like HKT (which of course we don't really have, but let's ignore that for now). However, that's not obviously perfect. If we were to use it, one would assume that the Self parameter for Carrier has kind type -> type -> type, since Result can implement Carrier. That would allow us to express things like Self<T,E> -> Self<U,F>. However, it would not necessarily apply to Option, which has kind type -> type (all of this would of course depend on precisely what sort of HKT system we adopted, but I don't think we'll go all the way to "general type lambdas"). Even more extreme might be a type like bool (although I don't want to implement Carrier for bool, I would expect some people might want to implement Carrier for a newtype'd bool).

What I had considered is that the typing rules for ? could themselves be special: for example, we might say that ? can only be applied to a nominal type Foo<..> of some kind, and that it will match the Carrier trait against this type, but it will require that the return type of the enclosing fn is also Foo<..>. So we would basically instantiate Foo with fresh type parameters. The downside of this idea is that if neither the type that ? is being applied to nor the type of the enclosing fn is known, we can't enforce this constraint without adding some new kind of trait obligation. It's also rather ad-hoc. :) But it would work.

Another thought I had is that we might reconsider the Carrier trait. The idea would be to have Expr: Carrier<Return> where Expr is the type ? is applied to and Return is the type of the environment. For example, perhaps it might look like this:

trait Carrier<Target> {
    type Ok;
    fn is_ok(&self) -> bool; // if true, represents the "ok-like" variant
    fn unwrap_into_ok(self) -> Self::Ok; // may panic if not ok
    fn unwrap_into_error(self) -> Target; // may panic if not error
}

Then expr? desugars to:

let val = expr;
if Carrier::is_ok(&val) {
    val.unwrap_into_ok()
} else {
    return val.unwrap_into_error();
}

The key difference here is that Target would not be the error type, but a new Result type. So for example we might add the following impl:

impl<T,U,E,F> Carrier<Result<U,F>> for Result<T,E>
    where E: Into<F>
{
    type Ok = T;
    fn is_ok(&self) -> bool { self.is_ok() }
    fn unwrap_into_ok(self) -> Self::Ok { self.unwrap() }
    fn unwrap_into_error(self) -> { Err(F::from(self.unwrap_err())) }
}

And then we might add:

impl<T> Carrier<Option<T>> for Option<T> {
    type Ok = T;
    fn is_ok(&self) -> bool { self.is_some() }
    fn unwrap_into_ok(self) -> Self::Ok { self.unwrap() }
    fn unwrap_into_error(self) -> { debug_assert!(self.is_none()); None }
}

And finally we could implement for bool like so:

struct MyBool(bool);
impl<T> Carrier<MyBool> for MyBool {
    type Ok = ();
    fn is_ok(&self) -> bool { self.0 }
    fn unwrap_into_ok(self) -> Self::Ok { debug_assert!(self.0); () }
    fn unwrap_into_error(self) -> { debug_assert!(!self.0); self }
}

Now this version is more flexible. For example, we could allow interconversion between Option values to be converted to Result by adding an impl like:

impl<T> Carrier<Result<T,()>> for Option<T> { ... }

But of course we don't have to (and we wouldn't).

nikomatsakis commented 7 years ago

@Stebalien

FYI, rust-lang/rfcs#1450 (types for enum variants) would open up some interesting ways implement Carrier

As I was writing up that idea I just wrote, I was thinking about having types for enum variants, and how that might affect things.

One thing I noticed writing some code that uses ? is that it is mildly annoying not to have any kind of "throw" keyword -- in particular, if you write Err(foo)?, the compiler doesn't know that this will return, so you have to write return Err(foo). That's ok, but then you don't get the into() conversions without writing them yourself.

This comes up in cases like:

let value = if something_or_other() { foo } else { return Err(bar) };
nikomatsakis commented 7 years ago

Oh, I should add one other thing. The fact that we allow impls to influence type inference should mean that foo.iter().map().collect()?, in a context where the fn returns a Result<..>, I suspect no type annotations would be required, since if we know that the fn return type is Result, only one impl would potentially apply (locally, at least).

nikomatsakis commented 7 years ago

Oh, and, a slightly better version of my Carrier trait would probably be:

trait Carrier<Target> {
    type Ok;
    fn into_carrier(self) -> Result<Self::Ok, Target>;
}

where you would implement it like:

impl<T,U,E,F> Carrier<Result<U,F>> for Result<T,E>
    where E: Into<F>
{
    type Ok = T;
    fn into_carrier(self) -> Result<T, Result<U,F>> {
        match self { Ok(v) => Ok(v), Err(e) => Err(e.into()) }
    }
}

And expr? would generate code like:

match Carrier::into_carrier(expr) {
    Ok(v) => v,
    Err(e) => return e,
}

A downside (or upside...) of this of course is that the Into conversions are pushed into the impls, which means that people might not use them when it makes sense. But also means you can disable them if (for your particular type) that are not desired.

Stebalien commented 7 years ago

@nikomatsakis IMO, the trait should be IntoCarrier and IntoCarrier::into_carrier should return a Carrier (a new enum) instead of re-using result like this. That is:

enum Carrier<C, R> {
    Continue(C),
    Return(R),
}
trait IntoCarrier<Return> {
    type Continue;
    fn into_carrier(self) -> Carrier<Self::Continue, Return>;
}
nikomatsakis commented 7 years ago

@Stebalien sure, seems fine.

nrc commented 7 years ago

Nominating for discussion (and possible FCP of the ? operator alone) at the lang team meeting. I assume we need to land some kind of temporary Carrier trait in the next few days to FCP.

nrc commented 7 years ago

I opened rust-lang/rfcs#1718 to discuss the Carrier trait.

nikomatsakis commented 7 years ago

Hear ye, hear ye! The ? operator specifically is now entering final comment period. This discussion lasts for roughly this release cycle which began on August 18th. The inclination is to stabilize the ? operator.

With respect to the carrier trait, a temporary version landed in #35777 which should ensure that we have the freedom to decide either way by preventing undesired interaction with type inference.

@rust-lang/lang members, please check off your name to signal agreement. Leave a comment with concerns or objections. Others, please leave comments. Thanks!

nielsle commented 7 years ago

I wonder if tokio-based libraries will end up using and_then a lot. That would be one argument for letting foo().?bar() be shorthand for foo().and_then(move |v| v.bar()), so that Results and Futures can use the same notation.

seanmonstar commented 7 years ago

Just to be clear, this FCP is about the question_mark feature, not catch, correct? The title of this issue implies the tracking of both of those features in this one issue.

durka commented 7 years ago

@seanmonstar the latter hasn't even been implemented, so, yeah. Presumably if the FCP results in acceptance this will be changed to track catch.