proposal: cmd/vet: check for missing Err calls for bufio.Scanner and sql.Rows

mdlayher commented 7 years ago

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (`go version`)?

go version go1.7.3 linux/amd64

What operating system and processor architecture are you using (`go env`)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/matt"
GORACE=""
GOROOT="/home/matt/.gvm/gos/go1.7.3"
GOTOOLDIR="/home/matt/.gvm/gos/go1.7.3/pkg/tool/linux_amd64"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build698545690=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"

What did you do?

If possible, provide a recipe for reproducing the error. A complete runnable program is good. A link on play.golang.org is best.

https://play.golang.org/p/zvrWRVgw4M

Used a bufio.Scanner without checking bufio.Scanner.Err() for error. The same mistake occurs frequently with sql.Rows.Err().

What did you expect to see?

go vet should warn that s.Err() was not called, and that it should be checked to determine if an error occurred while scanning.

This frequently occurs in our internal codebase at work. Arguably, go vet could check if any type has a method Err() error and report if the error is not checked or returned, instead of just checking the special cases with bufio.Scanner and sql.Rows. There may also be more types that use this pattern in the standard library.

What did you see instead?

No warnings from go vet.

josharian commented 7 years ago

There are legitimate cases in which you might elide the s.Err check, such as when reading lines from a strings.Reader; we know a priori that that cannot fail. False positives from vet are painful.

This frequently occurs in our internal codebase at work.

How frequently is frequently? How bad are the resulting bugs?

cc @robpike

mdlayher commented 7 years ago

How frequently is frequently? How bad are the resulting bugs?

It probably comes up in code review at least once or twice per week. Typically when interacting with a database.

As far as the severity of the resulting bugs, I am not sure. I'm on a different team and often just get asked to do code review sporadically, even though I typically don't maintain the projects where these problems occur.

guodongli-google commented 2 years ago

Perhaps a more proper place to implement this check is in the DeepGo tool since there is already a similar checker that handles missing calls on a resource. Specifically, it is not trivial to accurately check whether Err() should be called. Consider these cases:

"bufio.Scanner" escapes the function

func f(t *T) {
    s := bufio.NewScanner(r)
    ...
    t.s = s
}

"Err()" is called somewhere else

func f(t *T) {
s := bufio.NewScanner(r)
...
if g(s) {
   ...
}
}
func g(s *Scanner) bool {
if s.Err() {
    return true
}
...
}

rsc commented 2 years ago

How often do the things with Err methods get very complicated? It seems like basic bottom-up propagation like we do for identifying fmt.Printf wrappers might suffice for this check.

A bigger question is whether something with an Err method should always have that method called.

rsc commented 2 years ago

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

timothy-king commented 2 years ago

A bigger question is whether something with an Err method should always have that method called.

The bufio.NewScanner over a strings.Reader example given above makes me think we are unlikely to get a vet checker that applies to all types with an Err method. We may be able to make progress on more a restrictive set of checks like targeting bufio.Scanner or sql.Rows.

It seems like basic bottom-up propagation like we do for identifying fmt.Printf wrappers might suffice for this check.

Intraprocedural analysis seems sufficient for inferring that a NewScanner is called in the function, the value never escapes, and has no calls to Err(). Bottom up analysis is also going to be able to tell if the error return value in a (io.Reader).Read implementation is always the constant nil or an io.EOF (which strings.Reader can return). This would expand the set of io.Readers this can reason about, but this may be of marginal value.

The challenge for checking bufio.Scanner is one appears to need to know the io.Reader type passed to NewScanner. Passing an io.Reader in function arguments is a common way to generalize streams of bytes func foo(r io.Reader) { s := bufio.NewScanner(r); .... We can likely shoehorn a small amount of information from the caller by summarizing the functions using a bottom-up analysis (the summary I am imagining would be roughly 'if argument # X must be such and such a type, then there is an error on such and such line with this message"). I don't think this will be fast enough for vet beyond exactly 1 static call. If this seems too challenging for the gain, another option is that vet does not have to be this general in v0 and can focus on purely intraprocedural bugs [at least initially].

guodongli-google commented 2 years ago

If we cover only a predefined set of types such as bufio.Scanner then this looks like another lostcancel checker to me although now Err() an implicit "cancel" function that should be called. This missing Err() checker is more complicated though, since the buffer object may be passed around to many places including transitive callees, while the explicit cancel() function usually will not.

timothy-king commented 2 years ago

This missing Err() checker is more complicated though, since the buffer object may be passed around to many places including transitive callees, while the explicit cancel() function usually will not.

The small sample of bufio.Scanner examples I took a look at do not feature a Scanner that can syntactically escape. Do we have evidence about the frequency of Scanners that are passed around/might escape?

guodongli-google commented 2 years ago

The most common patterns I see in a few examples are:

    scanner := bufio.NewScanner(text)
    for scanner.Scan() { ... }
    if err := scanner.Err(); err != nil { ... }

and

    s := bufio.NewScanner(bytes.NewBuffer(b))
    if !s.Scan() {
        if err := s.Err(); err != nil {
            return nil, err
        }
        ...
    }

These are subject to a simple vet checker. However some examples make me wonder whether Err() should always be called, e.g.,

func main() {
    stdin := bufio.NewScanner(os.Stdin)
    for stdin.Scan() {
            p := stdin.Text()
        if strings.Contains(p, ...) { ... }
    }
}

and

func f(...) bool {
    scanner := bufio.NewScanner(&buf)
    for scanner.Scan() {
                fmt.Fprintf(..., scanner.Bytes())
    }
        return found
}

If "f()" return (..., error) then we should definitely check the Err() and return it. If the function doesn't care about the error and just does its best, then Err() may be skipped (although it is still good to have)?

timothy-king commented 2 years ago

  stdin := bufio.NewScanner(os.Stdin)

I am curious whether others think bufio.NewScanner(os.Stdin) might not require an Err() call.

func f(...) bool {
  scanner := bufio.NewScanner(&buf)
  for scanner.Scan() {
                fmt.Fprintf(..., scanner.Bytes())
  }
        return found
}
Without more context about &buf, I am not sure what I think about this example. What is the type of buf?

guodongli-google commented 2 years ago

Without more context about &buf, I am not sure what I think about this example. What is the type of buf?

buf is a memory buffer created using bytes.NewBuffer(). Among the 5 examples I've seen using such buffer, 4 of them do not check Err(). Among the other 5 examples that use os.File, 1-2 do not check Err().

I still think that this is a valid checker, except that the check may need to be stricter. For the case whether the enclosing function return an error, the Err() should be checked and returned.

rsc commented 2 years ago

bufio.Scanner can return errors no matter what kind of reader is passed into it. That is, I don't think we need to know anything about the type passed to bufio.NewScanner.

It still sounds like this is doable with bottom-up summaries, since we have no interfaces with Err() error methods, so there's very little uncertainty about what is going on in any given function.

The question remains: how many error reports would this produce, and are they useful?

rogpeppe commented 2 years ago

bufio.Scanner can return errors no matter what kind of reader is passed into it.

That's an excellent point that I hadn't realised (the specific error it can return is ErrTooLong). In the light of that, I support this proposal - I've definitely left out calls to Err in the past that I should have made.

timothy-king commented 2 years ago

I took another look at ~20 examples that have no Err() call. Most look like the following in golint. They appear to be unaware of ErrTooLong. It is possible these folks are aware of this limit, and their inputs happen to always obey them. My personal suspicion is that it is not a high %.

The cases I saw that are odd balls that plausibly might not bugs are:

Called for just a side effect on os.Stdin: bufio.NewScanner(os.Stdin).Scan(). (Seems like a bug but I'm not sure.)

And comparing one Text() against one string constant that is much shorter than 64k.

s bufio.NewScanner(s)
s.Scan()
if s.Text() != "all is well to turn on feature" {
fmt.Fatalln("Did not enter handshake constant") // I am exaggerating a bit but this is roughly what is going on
}
...

Neither is very compelling in the face of how many likely bugs I am seeing.

rsc commented 2 years ago

OK, so the question is should we do this just for bufio.Scanner? Probably 'everything with an Err method' is too much? It seems like bufio.Scanner is not special enough to warrant its own checker all by itself.

(Much the same way the implicit methods check is generalized and not one for json, one for gob, etc.)

mdlayher commented 2 years ago

For what it's worth, I still think this should be applied to sql.Rows at very least as well. That and bufio.Scanner are the most common misuses I see in stdlib.

But as for the generalizing to any Err() error after iteration, I'd still be +1 as I have mimicked the scanner APIs from the stdlib and require an error check for those APIs in my own libraries in the same way.

twmb commented 2 years ago

I've repeatedly ignored the error in the past when, if my scanner errors, I don't care and either exit the program (more common) or accept the best-effort of what was processed (less common). It'd be a little annoying to be forced to write:

if scanner.Err() != nil {
        // don't care
}

in these patterns.

To me, considering it is sometimes ok to ignore this error, forcing the error to be checked seems like a job better suited for an external linter. The documentation on Scan mentions Err twice. I'm not sure how scanner.Err() is more special than any other function that returns errors -- even though the error checking is one extra function, I can just as easily accidentally ignore errors on other far more important function calls.

mdlayher commented 2 years ago

If purposely ignoring the error is your intent, I see nothing wrong with the following to make that intent clear:

s := bufio.NewScanner(r)
for s.Next() {
   // ...
}
_ = s.Err()
return nil

twmb commented 2 years ago

Can't that same thought be applied to any error? If so, wouldn't errcheck be a fine linter to add to a CI pipeline, as well as rowserrcheck?

In a way, the thought of ignoring or not this particular error feels reminiscent to https://www.joeshaw.org/dont-defer-close-on-writable-files/.

I know that past issues about go vet sometimes revolve around the thought of no false positives. To me, forcing an error to be checked when I know it can be ignored is a false positive.

robpike commented 2 years ago

@twnb This has been suggested before, but there are too many instances of functions whose errors are not idomiatically checked (Printf is the canonical example) and others where the error is impossible but is present to satisfy an interface. It was deemed too messy in the implementation of the checker to keep programmer style consistent, or else too messy for programmers to put the underscores everywhere. No one wants the equivalent of the ugly (void) casts the C linter foisted on everyone in the 1980s.

twnb commented 2 years ago

这是来自QQ邮箱的假期自动回复邮件。您好，我最近正在休假中，无法亲自回复您的邮件。我将在假期结束后，尽快给您回复。

timothy-king commented 2 years ago

A reasonable reduction in scope for bufio.Scanner is to restrict it to the cases where the result of s.Scan() is used. The only plausible false positives I have seen thus far also happen to ignore this value. This have an okay "feel" to me as if one care about when Scan() returns false one also cares about Err() [at least that is my intuition].

timothy-king commented 2 years ago

I've repeatedly ignored the error in the past when, if my scanner errors, I don't care and either exit the program (more common) or accept the best-effort of what was processed (less common)

@twmb [or anyone else] Can you point to a real world example where you made either of these decisions? (Or provide an abstraction if it is proprietary?) The evidence I have looked at so far is a few dozen examples. All of this is code I did not write. I cannot see into the minds of what the authors were aware of or thinking. My best guess is that all of the ones I have looked at have a rare bug that is sneaking by. A real world example where the discussed check on bufio.Scanner would have a false positive would be evidence against going forward with it in cmd/vet.

twmb commented 2 years ago

Out of the repos I have lying that use bufio.NewScanner,

checks error:
  - 12 check error and return error
  - 5 check error and log
  - 1 dev tool that os.Exit(1)'s on err
  - 4 tool that log.Fatal's on non-nil scanner.Err()
deliberately ignores error:
  - 1 best effort reading lines, ok to fail whenever
  - 1 local dev program parsing dev-defined input
  - 1 parsing unknown input that was fully read into memory and size limited already
  - 1 cli tool parsing one line of stdin, error ignored but confirmation of input printed
  - 1 tool that looks for an input line in a file: I was going to throw this into the "potentially should" check below, but os.Open's error is deliberately checked and returns false as well; looks like any failure scenario is assumed false
  - 1 same tool^, prompt for one line of stdin, anything other than "yes" is not accepted
  - 1 separate tool, same pattern (prompt for one character, anything else is failure)
  - 1 tool that parses a single line of output and expects an exact match of a short string; anything not matching is rejected
  - 1 tool that re-formats a small block of already-in-memory text
  - 1 tool that parses the first line and looks for a specific character to deduce a format; any non-match exits the program with a non-scanner related error
  - 1 tool that is reading a small (<1k) file and looking for a specific prefix on a line, any failure is non-match that has a separate non-scanner related error
  - 2 tool parsing a well-defined file looking for a specific line, any failure is considered a failure
does not check error but potentially could:
  - 1 count lines in file
  - 1 cli tool parsing two lines of stdin, assuming success, not prompting for confirmation (input saved to file; ignoring error questionable here, perhaps fine)
  - 1 tool that attaches to a subcommand and forwards all lines to stdout; this could check and log that scanning the subprocess failed (I'm not sure if it's intentional that the error is ignored)
  - 1 program parsing input to calculate something

The most common trend for ignoring the error looks to be CLI tools that either (a) parse a single small line from stdin and fails if the input is anything other than an explicit "yes" or something, or (b) parse a well-defined small file for a specific line, where any non-match results in a separate error such that the scanner error unimportant. Less common are the few instances where best-effort parsing is fine, and the other instances where the scanner is parsing a small block of defined-in-program, size-already-validated chunk of text.

FWIW there were more deliberate ignores than I was expecting -- I only remembered my own prior use cases (a very small amount of times in 10 years). I have a bunch of cli tools cloned. Of the four instances of code where the error potentially should be checked, one was obviously a bug, but looking at the surrounding code, the result will be logs that calculations are going wrong. Ultimately, whether this vet is accepted or not will not fundamentally affect me. I do question just how common it is that people are truly forgetting to check the error such that a hardline stance should be taken to block any instance of ignoring the error.

timothy-king commented 2 years ago

@twmb I don't think I have enough context to make a judgement call on the "deliberately ignores error" examples you have listed. Can you share or point us at some of the "deliberately ignores error" examples? If it is not possible to provide the real version of that, is it possible to provide an abstracted version without proprietary details?

twmb commented 2 years ago

Here's one: https://github.com/fastly/cli/blob/6150a454861ec683eb81d3566336b9aa57e34e50/pkg/commands/compute/language_rust.go#L332-L349 This looks for a specific line, if it isn't found (no matter the cause), a separate, non-scanner related error is returned. I can see this quickly becoming a "I'd write that differently, so the example could be better" type of situation (I probably would write that chunk of code to include the scanner error). There are probably better examples, but this is the first one I saw when I re-looked at the 42 files I had opened yesterday. I meant to provide input that it is OK and sometimes a deliberate decision to ignore the error, and I still wonder how frequently this is accidentally forgotten such that forcing a check would actually be fixing things (I'll note that the first followup comment in this thread voices exactly my thoughts). I do not plan on spending more time re-going through the other examples again, as this has already been more time than I care to spend on something that, really in either direction, will be non-impacting to me.

earthboundkid commented 2 years ago

I still wonder how frequently this is accidentally forgotten such that forcing a check would actually be fixing things

I saw multiple complaints about forgetting the check on social media in a short period, so I opened #51299. I don't have a good way to check a corpus, but my belief is that it is a fairly common mistake.

timothy-king commented 2 years ago

Thank you for the example! This really moves the conversation forward.

I'll try to condense this down to what I think its core is.

func find(needle, haystack string) error { // returns an error if found
    scanner := bufio.NewScanner(strings.NewReader(s))
    scanner.Split(bufio.ScanWords)
    for scanner.Scan() {
        if scanner.Text() == needle { // this could really be any predicate on Text()
            return nil
        }
    }
    return fmt.Errorf("did not find needle (%q) in haystack (%q)", needle, haystack)
}

Basically whenever scanners.Scan() returns false the outer function returns a non-nil error regardless of the reason Scan() returns false. (The real example has more error conditions so error is reasonable as a return type.)

This is semi-plausible as a false positive. Satisfying the checker is imposing some cost here.

Personally I do not think the costs of asking for .Err() are super high in my simplified version or the one in https://github.com/fastly/cli/blob/6150a454861ec683eb81d3566336b9aa57e34e50/pkg/commands/compute/language_rust.go#L332-L349 . Both remidiations as _ = scanner.Err() (documents failure caused is understood and suppressed) or as if err := scanner.Err(); err != nil { return err } (more precise reason in both cases). But cost>0 here.

I am not convinced we are out an acceptable cost-benefit tradeoff for a vet checker yet. I think we need more examples and evidence though. Might require a more systematic approach? In the meantime, if there are other real world examples folks think might be false positives those would really help.

rsc commented 2 years ago

@timothy-king, it sounds like we should investigate the effect of checking for sql.Rows's Err method as well. That would be a nice second instance that we could treat as evidence for a general checker (but still limited to known types).

Merovius commented 2 years ago

To me, the difference between Close() error and Err() error is exactly that the former says "you need to call this and check for errors", while the latter says "you can call this to get an error". So, IMO, if we need a checker that Err was called, the method should've been called Close to begin with.

I'd be far more happy with a generic io.Closer check and adding a Close method to sql.Rows. If we don't want to pollute the API with a second method (which has to do the same thing, for backwards compatibility), a check for specific known types might be fine.

But a generic check for Err() error is IMO counterproductive, as we already have a well-known way to signal that a method must be called for error checking: Call it Close.

bcmills commented 2 years ago

@Merovius, I don't think the analogy to Close quite fits. An error from Close means “something you wrote might not have been flushed”, whereas an error from an Err method means “you might not have read everything”. (They're dual.)

The Err methods on bufio.Scanner and sql.Rows are pretty clearly the latter, not the former. (Note that Rows also already has a Close method, which releases the resources in the event of an early return.)

It's safe to ignore an error from Close if you didn't write anything (for example, if you only read things), but in general you may have to call it in order to release resources.

It's safe to ignore an error from Err() if you saw everything you needed and the rest of the contents aren't relevant. (But if that's the case, you probably shouldn't have called Scan after you found what you were looking for, either...)

zpavlinovic commented 2 years ago

@Merovius @bcmills My intuition tells me that scanner.Scan() in principle has a (bool, error) return type. However, this prevents the use of common pattern

for scanner.Scan() {
  ...
}

so the API is sort of split, where the return type of Scan() is bool and error is provided by Err(). So, b, _ := scanner.Scan() would be equivalent to b := scanner.Scan(); _ := scanner.Err(), which is different than not calling Err() at all. My intuition, of course, might be wrong.

Merovius commented 2 years ago

@bcmills Hm. I'm not sure I agree. I quickly grepped through the stdlib and there are three Err() error methods: In addition to the two we talked about, conforming to the "it's for reading" heuristic, there's scanner.ErrorList. But, importantly, there are many examples for Close() error which have nothing to do with writing. For example: io.PipeReader, syscall.Token (on windows), http.Response.Body and golang.org/x/exp/mmap.ReaderAt.

To me, "Close is something that must be called" seems a pretty consistent pattern (though not universal - e.g. PipeReader.Close doesn't have to be called). It's definitely firmly lodged into my brain. Err() error doesn't really have enough data-points to suggest a pattern (at least not in the stdlib), but at least the existence of scanner.ErrorList.Err seems to suggest that calling Err isn't really required.

Maybe the rule-ish is "Close has side-effects on the state of the type, whereas Err purely inspects it".

In any case. I still don't like this check. Personally, I just really dislike having to write _ = x.Err() or other somesuch patterns, just to pacify a linter.

earthboundkid commented 2 years ago

There could be a variant on bufio.Scanner with no Err method for use when you know you don’t care about the error. 🚲 bufio.LaxScanner?

rsc commented 2 years ago

FWIW @Merovius I think of Close's error as only being about failed writes before the close. You shouldn't have to check Close's result on a read-only stream, and that's what most of these are.

rsc commented 2 years ago

It sounds like it would be reasonable to write a checker for bufio.Scanner and sql.Rows not having their Err methods called. We can see what the precision is on the latter, but the former seems OK.

Does anyone object to moving forward with this?

mdlayher commented 2 years ago

It sounds like it would be reasonable to write a checker for bufio.Scanner and sql.Rows not having their Err methods called.

These are the two most common stdlib misuses I have run into, so SGTM.

rsc commented 2 years ago

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

timothy-king commented 2 years ago

I would like to collect more evidence before we make a decision here. I can volunteer to do this. Specifically I want to look at a few dozen more scanner.Err() cases for false positive prevalence and a couple dozen sql.Rows's Err cases. It may be a couple of weeks before I get to it though.

twnb commented 1 year ago

这是来自QQ邮箱的假期自动回复邮件。您好，我最近正在休假中，无法亲自回复您的邮件。我将在假期结束后，尽快给您回复。

earthboundkid commented 1 year ago

Is this still on hold?

timothy-king commented 1 year ago

I have not yet managed to collect the evidence I was hoping to. This may still take a while as we working on better ways of systematically testing such questions.

PleasingFungus commented 1 year ago

We recently discovered that missing calls to bufio.Scanner.Err() and sql.Rows.Err() were widespread in our codebase (appeared in ~10% of uses). Auditing our use of those types in our codebase, in all ~100 cases in which we called bufio.Scanner.Scan()/sql.Rows.Next(), it was correct for us to check the result of Err() afterward.

This check (or something similar) would be very helpful for us.

twnb commented 1 year ago

这是来自QQ邮箱的假期自动回复邮件。您好，我最近正在休假中，无法亲自回复您的邮件。我将在假期结束后，尽快给您回复。

golang / go