golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.15k stars 17.69k forks source link

spec: guarantee non-nil return value from recover #25448

Closed bradfitz closed 1 year ago

bradfitz commented 6 years ago

Calling panic with a nil panic value is allowed in Go 1, but weird.

Almost all code checks for panics with:

     defer func() {
        if e := recover(); e != nil { ... }
     }()

... which is not correct in the case of panic(nil).

The proper way is more like:

     panicked := true
     defer func() {
        if panicked {
              e := recover()
              ...
        }
     }()
     ...
     panicked = false
     return
     ....
     panicked = false
     return

Proposal: make the runtime panic function promote its panic value from nil to something like a runtime.NilPanic global value of private, unassignable type:

package runtime

type nilPanic struct{}

// NilPanic is the value returned by recover when code panics with a nil value.
var NilPanic nilPanic

Probably Go2.

cznic commented 6 years ago

Dear runtime,

if I perform panic(nil) or panic(somethingThatCanBeNil), it may be a mistake. That's my problem. But I may also do that intentionally and with the magically changed value, I need to not forget about that and workaround the magic.

Less the magic, the less exceptional rules I have to think about. It makes me more productive and my code more comprehensible to my future self. Thanks.

edit: typo

mdempsky commented 6 years ago

An alternative solution would be to allow recover() to be used in ,ok assignments. For that to really solve the stated problem though, that would require all call sites to be updated.

My personal leaning at the moment is in favor of the proposal as stated.

mvdan commented 6 years ago

Is any program out there using nil panics intentionally? A simple search for panic(nil) doesn't give anything on my entire GOPATH besides a go/ssa/interp test. But I'm more worried about panics with variables that could/would be nil.

In any case, I agree with the sentiment and the proposed solution. Perhaps runtime.NilPanic should be clarified that it's only for untyped nils, though. For example, this case has a nil value but wouldn't be equal to nil when recovered:

var err *myError
panic(err)
cznic commented 6 years ago

FTR: I admit this is a real problem. I just prefer explicitly handling it.

robpike commented 6 years ago

Is this a real problem, though? I doubt it. What if the implementation of panic with a nil argument instead just did, panic("nil value passed to panic")? Thereby fixing the problem and diagnosing it one one stroke.

bradfitz commented 6 years ago

Is this a real problem, though? I doubt it.

I've had to deal with it at least twice. net/http had hangs when people panicked with nil.

@mpvl was talking about error handling the other day and was showing some examples of how defensive code should ideally look like (and how it's hard to get right), and he forgot the nil panic case, showing it's even harder than it seems.

What if the implementation of panic with a nil argument instead just did, panic("nil value passed to panic")? Thereby fixing the problem and diagnosing it one one stroke.

That's what I'm proposing, except with a variable (which could have a String method with that text). But I'm fine (but less fine) with it being that string value exactly, as matching on strings is a little gross.

wgoodall01 commented 6 years ago

This proposal makes total sense--for a language which prides itself on its simplicity and obviousness, it is perplexing that the only way to check for a panic(nil) in a recover is by using some other variable. I can't possibly think up a situation when someone would call panic(nil) on purpose anyhow.

deanveloper commented 6 years ago

While I believe that this is a very good thing to be able to handle, I think that a , ok pattern would be a bit better. If I panic(nil) then in a language like Go, I would want nil to be the recover() value.

So then recover would look like:

defer func() {
    if e, ok := recover(); ok {
        // do recovery stuff
    }
}();
cznic commented 6 years ago

@deanveloper I like your idea for being backwards compatible.

bcmills commented 6 years ago

The proper way is more like: […]

That turns out not to be correct either: runtime.Goexit can terminate the goroutine without a recoverable value, but that pattern (and the variant as written by @cznic) would incorrectly report a panic where no panic occurred. (https://play.golang.org/p/yS1A1c5csrR)

As far as I can tell, there is no way for a defer call to distinguish between panic(nil) and runtime.Goexit(), which to me suggests that at least one of them should be removed.

bcmills commented 6 years ago

As far as I can tell, there is no way for a defer call to distinguish between panic(nil) and runtime.Goexit(), which to me suggests that at least one of them should be removed.

As it turns out, we can use the fact that a runtime.Goexit() cannot be recover'd to distinguish it from panic(nil), via a “double defer sandwich”: https://play.golang.org/p/nlYYWPRO720

gopherbot commented 6 years ago

Change https://golang.org/cl/134395 mentions this issue: errgroup: rethink concurrency patterns

ianlancetaylor commented 6 years ago

I think we should treat panic(nil) as a runtime error, so it should panic with a value that implements the runtime.Error interface. This is less convenient for people who want to explicitly detect panic(nil), but I don't see why that matters; if you want to check for it, then, instead, don't do it.

deanveloper commented 6 years ago

if you want to check for it, then, instead, don't do it.

Perhaps you're using a badly-written library, and you don't have control over it. I've seen worse before :man_shrugging:

ghost commented 5 years ago

Our team spent 3 hours today hunting down what we now know to be a panic(nil) call. The problem is that panic(nil) is valid yet undetectable since recover() only returns the value that was panic-ed.

This proposal as-is would work as the common if recover() != nil { pattern would see a non-nil value. Therefore, has my vote. (Some internal type would work too, as long as it is not nil)

Making recover() return two values (the panic-ed value and whether a panic occurred) as mentioned several times would make more sense, However, changing recover() will undoubtedly be rejected for being non-backward compatible. Or too complicated if made magical (like map and range that provide one or two values, depending how you use it).

zigo101 commented 5 years ago

Sometimes. people may call panic(nil) for success. See the case 3 in this article for an example.

[edit] in this example, the panic value is not nil, but it can be.

bradfitz commented 5 years ago

@go101, eliminating code like that would be an added bonus.

zigo101 commented 5 years ago

I agree this pattern is some weird. But I think panic(nil) should be treated as an unexpected feature.

Another use case (also some weird, ;D):

func doSomething() (err error) {
    defer func() {
        err = recover()
    }()

    doStep1()
    doStep2()
    doStep3()
    doStep4()
    doStep5()

    return
}

// panic with nil for success and no needs to continue.
// panic with error for failure and no needs to contine.
// not panic to continue.
func doStepN() {...}

which is much less verbose than

func doSomething() (err error) {
    shouldContinue, err := doStep1()
    if !shouldContinue {
        return err
    }
    shouldContinue, err = doStep2()
    if !shouldContinue {
        return err
    }
    shouldContinue, err = doStep3()
    if !shouldContinue {
        return err
    }
    shouldContinue, err = doStep4()
    if !shouldContinue {
        return err
    }
    shouldContinue, err = doStep5()
    if !shouldContinue {
        return err
    }

    return
}

// if err is not nil, then shouldContinue must be true.
func doStepN() (shouldContinue bool, err error) {...}
bradfitz commented 5 years ago

@go101, this is quickly going off topic. All that code might be amusing to some, but it's not code we want to enable or encourage.

zigo101 commented 5 years ago

I agree.

neild commented 4 years ago

The proposal as-is could be conditionalized on the Go version specified in the go.mod file, to be be fully backwards-compatible.

bcmills commented 4 years ago

@neild, the go version specified in the go.mod file applies to individual packages, not the binary as a whole. What happens if a go 1.13 package passes nil as an interface{} to a go 1.15 package, which then invokes panic with that as its argument?

neild commented 4 years ago

@bcmills Simplest approach I can think of: panic always converts nil to runtime.NilPanic. In a pre-go1.15 package, recover converts it back into nil.

So if a go1.13 package passes (interface{})(nil) to a go1.15 package which panics, that's the same as panic(runtime.NilPanic). However, if a go1.13 package recovers from that panic, it sees nil.

stephens2424 commented 4 years ago

I came across the problem of panic(nil) the other day. I think I generally like both proposals here. Independent of those, however, I'd also propose we include checking panic values as a go vet check. I suspect a common reason this would occur in the wild is a copy/paste mistake like

x, err := doThing()
if err != nil {
    panic(err)
}
if x == nil {
    panic(err)
}

I've seen code along these lines before, when dealing with odd cases where both return values of a function might be nil in some case. The second panic was copied to quickly handle x being nil after realizing the possibility, but not taking care to switch to a new value to panic with.

While doing one of the proposals above would be great for robustness, I think a vet check would help catch the simple mistakes like this one. I've got a sketched out change that adds it as a step to the existing package nilness. It seems like a good fit: it already built up all the requisite facts, so the work was just looking at the panics. The tests I've written seem like it detects the common cases I'd care about, as well.

The change is here: https://github.com/stephens2424/tools/commit/3817b808a2d7b6515559b16a088aa978aa7d8598

If the idea sounds good, I'll tidy it up into a real CL.

josharian commented 4 years ago

@stephens2424 seems promising. If it isn't too much work, please do mail a CL.

gopherbot commented 4 years ago

Change https://golang.org/cl/220777 mentions this issue: go/analysis/passes/nilness: detecting panic with provably nil values

stephens2424 commented 4 years ago

I've uploaded the change, cleaned up a bit and with a real commit message. In the past few days, I've realized a couple points, however.

First, as is, this change will not be included by go vet. I see some information that suggests it might be available in golangci-lint, and less convincingly, in gopls and some odd copy of go vet. (As an aside, the difference between importers in godoc.org and pkg.go.dev is interesting.) I gather that the reason this analysis pass is not in go vet is because the SSA form is too expensive to compute, and I doubt this new feature adds enough value to tip that scale. I think this new feature makes sense to exist in the pass, though, and it'll get picked up by tools if they do decide to add this analyzer.

Second, it occurred to me that, if it's practical, reversing this analyzer, making it much more strict, could be an interesting alternative approach. That is to say: the analyzer would require that panic values be provably non-nil, requiring authors to add specific checks when there's any ambiguity. If it works, I think it could be a viable solution to this issue on its own, or possibly with some attention to getting wider adoption of the analyzer. I'm still new to working with the SSA form, though, so my experimentation with the idea is a little slow going, and I might be getting ahead of myself with this even being possible. Fun to learn it though! It seems very powerful!

darkfeline commented 4 years ago

I have a slightly ugly suggestion for this. Add a new builtin:

func recover2() (v interface{}, ok bool)

This is a backward compatible change, and provides a way for newly written code to be able to check whether a panic(nil) happened.

Edit to add more exposition:

I don't think it's possible to retroactively fix existing code using recover/panic "incorrectly" without breaking the backward compatibility promise. It's perfectly valid according to the Go spec to write code that deliberately calls panic(nil) and handles that case accordingly with recover() != nil. It might be in poor taste, but the behavior is unambiguously defined in the spec. We cannot change this unless we guarantee that no code has ever done this (impossible) or break users of well-defined Go behavior (really bad PR for the backward compatibility promise).

I think this case is similar to time.Timer.Reset, whose return value cannot be used correctly. A new recommendation was added, but we didn't try to retroactively fix broken code.

mdempsky commented 4 years ago

@darkfeline Why add recover2 instead of just allowing err, ok := recover()?

darkfeline commented 4 years ago

@mdempsky I'm not sure that's possible, given that recover is a normal function, not a keyword. If it is possible, then allowing recover() to have one or two return values depending on context would be better, yes.

Edit: See https://github.com/golang/go/blob/master/src/runtime/panic.go#L1080 I think it'd be difficult to overload recover().

deanveloper commented 4 years ago

@darkfeline it's not a normal function, it's a builtin function. Builtin functions often have special behaviors that cannot be represented by normal functions (ie make and new can take a type as an argument)

The most obvious solution would be to have both a runtime.gorecover and runtime.gorecover2, and the compiler can easily replace the recover() call with either gorecover or gorecover2 depending on the lvalue(s), similarly to how mymap[key] is changed to runtime.mapaccess1 or runtime.mapaccess2 depending on the lvalue(s).

One potential issue to this is documentation in the builtin package, I'm not exactly sure what the "signature" of builtin.recover would look like. Although that may be only a minor issue.

mdempsky commented 4 years ago

I'm not sure that's possible, given that recover is a normal function, not a keyword.

Yes, it's possible. Recover is a built-in function, and built-in functions have whatever behavior we specify for them. Most of the built-in functions have some behavior that couldn't be emulated with a normal function.

darkfeline commented 4 years ago

@deanveloper I see. I didn't realize you couldn't do this, although it's obvious in hindsight.

bcmills commented 4 years ago

@deanveloper, I see two (related) problems with the , ok approach:

  1. A whole lot of Go code has already been written without it, and generally the authors of that existing code were not expecting recover to return nil during a panic.

  2. I would bet that existing Go developers will forget that it exists, and therefore not use it, leading to the same problem as (1) in new code.

So I think the , ok approach is only viable if we remove the non-, ok form from the language entirely, and that seems like a more drastic remedy than just disallowing panic(nil) (which is almost never used intentionally today).

darkfeline commented 4 years ago

@bcmills What do you think of my analysis in https://github.com/golang/go/issues/25448#issuecomment-596190187?

I don't think not being able to fix existing code means we should not add a straightforward way to correctly handle this case for new code.

Also, there are many cases where using a single valued recover is correct:

func foo() {
    defer func() {
        v := recover()
        if v, ok := v.(myError); ok {
            // do stuff
        }
    }()
    // do stuff
}
bcmills commented 4 years ago

@darkfeline, I think that analysis overlooks @neild's insight in https://github.com/golang/go/issues/25448#issuecomment-565220299.

New code could set a new go version in its go.mod file, and check for v == runtime.NilPanic (which should be rare anyway), which would still give intuitive behavior for the simpler v != nil test to mean “was there a panic?” (as folks empirically assume it means today).

That wouldn't necessarily leave a way for new code to produce a nil recover() result in legacy callers, but I would expect the cases where that is necessary to be vanishingly rare anyway: the main use-case of panic(nil) today is likely to sneak past existing (buggy) nil-checks, and I think we should encourage upstream fixes rather than sneaky workarounds.

bcmills commented 4 years ago

And, FWIW, I don't think that example of a single-valued recover is actually “correct”: if you recover an error other than myError, that code will silently swallow it rather than propagating it, which may leave the program in an arbitrarily corrupted state (especially if the panic was due to an unexpected nil-dereference).

A more-correct version typically looks something like:

func foo() {
    defer func() {
        v := recover()
        if v, ok := v.(myError); ok {
            // do stuff
        } else if v != nil {
            panic(v)  // lost the stack trace, but at least we can propagate the panic... 🤷
        }
    }()
    // do stuff
}

But then we're back to needing to distinguish panic(nil) from other cases.

neild commented 4 years ago

I think the simplest fix is what I proposed in https://github.com/golang/go/issues/25448#issuecomment-565220299:

If the Go version in go.mod is greater than some version, then recover returns runtime.NilPanic instead of nil when recovering from panic(nil). The only user-visible change is to the recover function.

If you want to panic(nil), you still can. If you want to handle nil panics, you can by explicitly checking for runtime.NilPanic. So long as your Go version is recent, you get a guarantee that recover returns nil iff there was no panic. Older code is unaffected.

macklin-10x commented 2 years ago

Any chance this fix will see the light of day? I burned something like 4+ hours debugging what turned out to be a panic that was passed a nil interface...

deanveloper commented 2 years ago

Another problem I have realized with runtime.NilPanic: the ways that panic-recover is detailed in the spec, so how would we document how panic(nil) works? The spec shouldn't depend on the standard library. Perhaps we could attach Implementation detail: If a panic is called with a nil argument, the return value of recover may be a sentinel value that can be found in the standard library, usually at runtime.NilPanic. I'm not really a fan of this solution, though.

@bcmills

I see two (related) problems with the , ok approach:

  1. A whole lot of Go code has already been written without it, and generally the authors of that existing code were not expecting recover to return nil during a panic.

I don't really see this as an issue - we can't reasonably go about "fixing" already written code.

  1. I would bet that existing Go developers will forget that it exists, and therefore not use it, leading to the same problem as (1) in new code.

This could be solved by putting this in go-lint (and maybe eventually vet?), and it would be relatively easy to add to go fix as well.

neild commented 2 years ago

The spec shouldn't depend on the standard library.

The spec already mentions the runtime package: "Execution errors such as attempting to index an array out of bounds trigger a run-time panic equivalent to a call of the built-in function panic with a value of the implementation-defined interface type runtime.Error."

https://go.dev/ref/spec#Run_time_panics

My proposed change would be something like this in the section https://go.dev/ref/spec#Handling_panics: "If panic's argument was nil, the return value of recover is the implementation-defined value runtime.NilPanic. This type satisfies the predeclared interface type error."

gopherbot commented 2 years ago

Change https://go.dev/cl/416555 mentions this issue: errgroup: propagate panics and goexits from goroutines in the group

rsc commented 1 year ago

Retitled to reflect @neild's suggestion most recently in https://github.com/golang/go/issues/25448#issuecomment-599760022: if go.mod indicates a new enough Go version, then recover() in that module always returns a non-nil value. If there is an active panic(nil), recover returns runtime.NilPanic.

If we do make this change, it seems like perhaps we should change the result of recover() during runtime.Goexit as well, so that recover (called from a deferred function) returning nil guarantees that the function that did the defer is returning normally.

bcmills commented 1 year ago

If we do make this change, it seems like perhaps we should change the result of recover() during runtime.Goexit as well,

I don't think that we should change recover to return a non-nil error during runtime.Goexit.

Consider the existing code following the pattern in https://github.com/golang/go/issues/25448#issuecomment-599755154. Changing runtime.Goexit to return a non-nil error in case of runtime.Goexit would cause a function like that one to panic if the function body calls testing.TB.SkipNow, whereas previously it would have successfully skipped the test.

bcmills commented 1 year ago

(In an out-of-band conversation, @aclements suggested that perhaps we could add a func Goexiting() bool to package runtime to more easily detect that condition, but that seems like it should be a separate proposal.)

rsc commented 1 year ago

Instead of keying off go.mod to make the change, a strategy more in keeping with our existing practice would be to simply make the change but have a GODEBUG to turn it off, with perhaps that GODEBUG set automatically when the go version is too old. That's kind of the opposite of what Damien suggested but it has a clearer expectation for what happens in old versions. This approach would only be reasonable if we were confident that panic(nil) happens exceedingly rarely in existing code, so that changing the behavior would break very little. Do we have any data about that?

neild commented 1 year ago

The nice thing about keying off go.mod is that the decision can be made on a per-module basis. GODEBUG is program-wide.

DeedleFake commented 1 year ago

As @neild said, if it's left to GODEBUG, it could result in impossible-to-resolve breakage because of incompatibilities between modules. You might have one dependency that expects it to work one way and one that expects it to work another way. This is a language change and it seems to me that it should be tied, via the go.mod file, to the version of the language in which the change happened. As far as I know, changes like this are pretty much the reason that the go <version> directive exists.

rsc commented 1 year ago

The nice thing about GODEBUG is we can do it now. If we key off go.mod in this way then really we should be waiting for #55092.

neild commented 1 year ago

What makes GODEBUG better than go.mod for use now? Without #55092, both are advisory and silently fail to take effect on older Go versions. Even with #55092, a version in go.mod will silently fail to take affect on versions predating toolchain-upgrade support.