golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
123.97k stars 17.67k forks source link

x/tools/gopls: stop logging context cancellation errors #34103

Closed myitcv closed 4 years ago

myitcv commented 5 years ago

What version of Go are you using (go version)?

$ go version
go version devel +9da7abd2eb Tue Sep 3 18:35:32 2019 +0000 linux/amd64
$ go list -m golang.org/x/tools
golang.org/x/tools v0.0.0-20190905035308-adb45749da8e
$ go list -m golang.org/x/tools/gopls
golang.org/x/tools/gopls v0.1.4-0.20190905035308-adb45749da8e

Does this issue reproduce with the latest release?

Yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE="on"
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/myitcv/.cache/go-build"
GOENV="/home/myitcv/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/myitcv/gostuff"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/home/myitcv/gos"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/home/myitcv/gos/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/myitcv/gostuff/src/github.com/myitcv/govim/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build307597040=/tmp/go-build -gno-record-gcc-switches"

What did you do?

The symptoms look similar to https://github.com/golang/go/issues/33647. However, whereas before I could reproduce this reliably, now it happens much more sporadically.

In govim we handle changes to (generated) files not loaded in the editing change via a file watcher (whilst we await https://github.com/golang/go/issues/31553)

The test failures we are seeing manifest with the following errors on gopls's stderr:

2019/09/05 08:41:07 failed to check package: context canceled
2019/09/05 08:41:07 failed to compute diagnostics: no CheckPackageHandles for file:///tmp/go-test-script292517338/script-complete_watched/main.go

These errors then get reported via the LogMessage callback.

https://go-review.googlesource.com/c/tools/+/192719 improved this situation, however following that CL we still see this ~10% of the time.

Unfortunately we haven't yet been able to capture a gopls log for this, hence here is the log of calls from govim's perspective:

https://gist.github.com/myitcv/6a5550365272416fb4d8673ff7fa8460

What did you expect to see?

No errors.

What did you see instead?

The above errors.


cc @stamblerre @ianthehat @muirrn

FYI @leitzler

myitcv commented 5 years ago

We're also seeing these errors as part of our hover tests.

So whilst they might be legitimate errors, is it also possible that we're spuriously logging non-error events as errors?

myitcv commented 5 years ago

Hmm, now we're seeing it as part of our "go to def" tests. The call to Definition failed with:

no CheckPackageHandles for file://$WORK/q/q.go

It's unclear whether the other "errors" were in fact errors, or whether the user would have noticed.

But in this case the user would definitely have noticed: the call to Definition returned an error when it should not have done.

muirdm commented 5 years ago

I occasionally see this myself, but never reproducibly. I haven't been able to dig into it, but I have a couple ideas of what might cause this.

It's unclear whether the other "errors" were in fact errors, or whether the user would have noticed.

In general context.Canceled shouldn't be logged as an error since it is expected, but given our track record for handling cancelation properly, we should probably leave the logs in for now.

gopherbot commented 5 years ago

Change https://golang.org/cl/193480 mentions this issue: internal/lsp: avoid using the importer's context as much as possible

myitcv commented 5 years ago

I occasionally see this myself, but never reproducibly

I can fairly reliably cause it to happen by repeatedly running the govim tests, but never on demand for a given run/test. So if there's some additional logging/whatever that will help to track this down we could create a CL and see how far we get.

we should probably leave the logs in for now.

Understood.

At the moment we're holding off upgrading gopls/x/tools because of this. https://go-review.googlesource.com/c/tools/+/192719 already improved the situation massively; this issue is what remains. Is the general feeling therefore that we should upgrade and manage these test "flakes", continuing to try and track down the root cause? That's a question for you, @muirrn, as well as @stamblerre and @ianthehat

stamblerre commented 5 years ago

I don't think I would feel comfortable tagging a new version (and in the same vein, having govim use an updated version), until we aren't seeing these issues anymore.

myitcv commented 5 years ago

Just tried against 2a03e9e (merged result of https://go-review.googlesource.com/c/tools/+/193480/) and got a failure. Also had a ~15% failure rate on CI.

2019/09/05 17:16:48 failed to check package: context canceled
2019/09/05 17:16:48 failed to compute diagnostics: no CheckPackageHandles for file:///tmp/go-test-script854391406/script-function_hover/main.go
myitcv commented 5 years ago

Following a couple of conversations with @muirrn and @stamblerre I've concluded I'm conflating two things here.

1) spurious LogMessage callbacks with "errors that are not actually errors" 2) "functional" errors, where a go-to-definition (in the instance we saw in https://github.com/golang/go/issues/34103#issuecomment-528270271) failed because of a context cancellation

I'm going to leave this issue for category 1. These spurious "errors" appear to manifest when changes to the same file happen in quick succession. The error is apparently reported because the attempt to compute diagnostics for the first change is cancelled by the second change. This is not an error.

I'll create an issue for real errors/failures that fall into category 2 when I have more details.

If we can solve this issue (i.e. the noise from "errors that are not errors") we can turn on more aggressive assertions about the number of errors that are reported to the LSP client.

stamblerre commented 4 years ago

@myitcv: Is this still an issue for you? We've added more handling to ensure that context cancellations aren't logged as errors.

myitcv commented 4 years ago

Yes we're still seeing an issue here.

In the as-yet-unmerged https://github.com/govim/govim/pull/661 we have turned our assertions that there should (in most of our test cases) be no errors/warnings at the end of the test back on. And we have test failures across the board, i.e. the assertions fail.

Per https://github.com/govim/govim/commit/afa4918a6d4d485b6797580c76f894c41a8c2a1b, this uses x/tools and gopls 53017a39 with 11e9d9cc cherry-picked on top.

Some of the errors/warnings we see logged are:

LogMessage callback: &protocol.LogMessageParams{Type:1, Message:"2020/01/08 12:55:45 failed to load snapshot: no packages found for query ./...\n\tdirectory = file:///artefacts/cmd/govim/scenario_default/go-test-script313459836/script-format_on_save_new_file"}
LogMessage callback: &protocol.LogMessageParams{Type:1, Message:"2020/01/08 12:55:46 diagnoseFile: could not generate diagnostics: couldn't exec 'go [list -e -json -compiled=true -test=true -export=false -deps=true -find=false -- /artefacts/cmd/govim/scenario_default/go-test-script313459836/script-format_on_save_new_file]': context canceled *errors.errorString"}
LogMessage callback: &protocol.LogMessageParams{Type:1, Message:"2020/01/08 12:57:16 : stat /artefacts/cmd/govim/scenario_goimports_local/go-test-script051104478/script-new_import/gopath/pkg/mod/rubbish.com/useless@v1.0.0.tmp-875641904/main.go: no such file or directory"}
LogMessage callback: &protocol.LogMessageParams{Type:1, Message:"2020/01/08 12:57:16 no diagnostics: context canceled\n\tURI = file:///artefacts/cmd/govim/scenario_goimports_local/go-test-script051104478/script-new_import/main.go"}
LogMessage callback: &protocol.LogMessageParams{Type:1, Message:"2020/01/08 12:55:14 diagnoseFile: could not generate diagnostics: go [list -f {{context.GOARCH}} {{context.Compiler}} -- unsafe]: signal: killed: "}

All of the gopls log files for this test run are available via:

cd $(mktemp -d) && curl -s https://gist.githubusercontent.com/myitcv/b75a1adc6b140d9f01175740e842920e/raw/adf8afc7922658a97dcfb206094b712b9725e394/logs.base64 | base64 -d | tar -zx
stamblerre commented 4 years ago

Thanks for following up. These logged errors are probably the result of go/packages.Load not returning the pure context.Canceled, but we can add some workarounds.

sbromberger commented 4 years ago

Hi there,

I just updated to latest gopls master and am getting these errors when opening a file in neovim (if it matters, I'm using neovim's built-in lsp client):

LSP[gopls] 2020/01/08 21:10:56 no diagnostics: context canceled
^IURI = file:///home/seth/dev/go/src/github.com/sbromberger/vt1/main.go
Press ENTER or type command to continue

This wasn't happening with my previous gopls install (master from a week or so ago).

stamblerre commented 4 years ago

@sbromberger: My guess is that the errors you are seeing are related to https://github.com/golang/go/issues/36463. I will hopefully have a fix for that issue in the next few days. In the meantime, I'd recommend sticking with the version of gopls tagged latest, since master can be broken at times.

stamblerre commented 4 years ago

@myitcv: These errors should be handled correctly now, as of https://golang.org/cl/214581.

myitcv commented 4 years ago

The only errors we're now seeing (as of 473961ec044c) are:

LogMessage callback: &protocol.LogMessageParams{Type:1, Message:"2020/01/16 12:12:55 diagnose: no workspace packages: context canceled\n\tdirectory = 0xa8d010"}
gopherbot commented 4 years ago

Change https://golang.org/cl/215019 mentions this issue: internal/lsp: don't log context.Cancelation in diagnostics

stamblerre commented 4 years ago

Thanks - one more log statement to clean up then.

myitcv commented 4 years ago

I've just done a test against 0cba7a3a with partial revert of CL 214586 applied on top, in the form of CL 215239, and we're seeing the following error log messages:

LogMessage callback: &protocol.LogMessageParams{Type:1, Message:"2020/01/20 13:37:31 diagnose: no workspace packages: go [list -f {{context.GOARCH}} {{context.Compiler}} -- unsafe]: exit status 1: go: updates to go.mod needed, but contents have changed\n\n\tdirectory = 0xa98af0"}

Figured I'd just re-open this issue because it's basically the same theme as above, albeit it not context cancellation this time.

gopls log file as follows: fail.log

stamblerre commented 4 years ago

This looks like a legitimate error from go list - are you able to consistently reproduce it?

myitcv commented 4 years ago

are you able to consistently reproduce it?

No, it's just showing up in a PR where we cause tests to fail if we see error/warning log messages when they're not expected: https://github.com/govim/govim/pull/661

For example, two failures in this test run:

cd $(mktemp -d) && curl -s https://gist.githubusercontent.com/myitcv/7df92147095b99b1b9ce2c2cb5c9d6c8/raw/a95aab6b4446fd6690414c150e91ddc02b17e834/logs.base64 | base64 -d | tar -zx

Log files that show errors are:

govim/cmd/govim/scenario_default/script-go_to_def_same_file/_tmp/gopls.log
govim/cmd/govim/scenario_default/script-go_to_def_usetab/_tmp/gopls.log
stamblerre commented 4 years ago

Hm, well this does seem to be a legitimate error coming from go list, so it seems fair to me that we are logging it. I think I'll go ahead and close this issue, and if you have a test where this repros consistently, please open a new issue so we can investigate it. I know it looks like the context cancellation errors because of this part: go [list -f {{context.GOARCH}} {{context.Compiler}} -- unsafe], but this is actually just one of the calls to go list done by go/packages, so it is a real error.

myitcv commented 4 years ago

Ok, but in this case we are not expecting any calls to go/packages to fail. So whilst I defer to you on whether this is a real error or not, it's a real problem from my perspective because there should be no reason why these calls are failing here. I'll raise a separate issue tomorrow.

stamblerre commented 4 years ago

Sorry, yes I agree that the error should not be happening. But I don't think that this falls under the issue, since this was about our sporadic logging of useless context cancellation errors. This sounds like a separate issue entirely to me.