golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License

cmd/go: TileReader returned bad result slice #31744

Closed: bradfitz closed this issue 5 years ago

bradfitz commented 5 years ago

https://storage.googleapis.com/go-build-log/0a52a6b5/windows-386-2008_f461ba84.log

go test proxy running at GOPROXY=http://127.0.0.1:49181/mod
--- FAIL: TestScript (0.00s)
    --- FAIL: TestScript/mod_concurrent (0.09s)
        script_test.go:190: 
            # Concurrent builds should succeed, even if they need to download modules. (0.068s)
            > go build ./x &
            > go build ./y
            [stderr]
            go: finding rsc.io/sampler v1.0.0
            verifying golang.org/x/text@v0.3.0/go.mod: golang.org/x/text@v0.3.0/go.mod: checking tree#82: TileReader returned bad result slice (tile/8/0/000.p/82 len=0, want 2624)
            verifying rsc.io/sampler@v1.0.0/go.mod: rsc.io/sampler@v1.0.0/go.mod: checking tree#82: TileReader returned bad result slice (tile/8/0/000.p/82 len=0, want 2624)
            [exit status 1]
            FAIL: testdata\script\mod_concurrent.txt:5: unexpected command failure

FAIL
FAIL    cmd/go  104.915s
2019/04/29 16:24:38 Failed: exit status 1

bradfitz commented 5 years ago

TryBot failures from this, so please fix or skip soon.

FiloSottile commented 5 years ago

That should never happen. (cit.)

It's a bug in the tlog code. Reassigning and looking into it.

Edit: this is actually a bug in the sumweb TileReader implementation, spoke too soon.

FiloSottile commented 5 years ago

This is outside my expertise. Something down the chain returned a zero-length slice with a nil error when it shouldn't have, but I can't find the issue in sumweb itself, so it might be in the test, the cache, or the web interface.
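For readers decoding the error above: the path `tile/8/0/000.p/82` names a partial hash tile of height 8, level 0, index 0, holding 82 hashes, and at 32 bytes per SHA-256 hash that gives the "want 2624" in the log. The following is a minimal sketch of that length check, not the actual tlog or sumweb code. The names `expectedTileLen` and `checkTile` are hypothetical, and the parser assumes the `tile/H/L/NNN[.p/W]` path layout for hash tiles only.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// hashSize is the size in bytes of one SHA-256 hash stored in a hash tile.
const hashSize = 32

// expectedTileLen (hypothetical helper) returns how many bytes a hash tile
// should contain, judging only by its path. It assumes the layout
// "tile/H/L/NNN" for a full tile (2^H hashes) and "tile/H/L/NNN.p/W" for a
// partial tile (W hashes). Data tiles, whose level is "data", are rejected
// because their records have variable length.
func expectedTileLen(path string) (int, error) {
	parts := strings.Split(path, "/")
	if len(parts) < 4 || parts[0] != "tile" {
		return 0, fmt.Errorf("malformed tile path %q", path)
	}
	height, err := strconv.Atoi(parts[1])
	if err != nil || height < 1 || height > 30 {
		return 0, fmt.Errorf("bad tile height in %q", path)
	}
	if _, err := strconv.Atoi(parts[2]); err != nil {
		return 0, fmt.Errorf("not a hash tile: %q", path)
	}
	full := 1 << uint(height)
	last, prev := parts[len(parts)-1], parts[len(parts)-2]
	if strings.HasSuffix(prev, ".p") {
		// Partial tile: the final path element is the width in hashes.
		width, err := strconv.Atoi(last)
		if err != nil || width < 1 || width >= full {
			return 0, fmt.Errorf("bad partial width in %q", path)
		}
		return width * hashSize, nil
	}
	return full * hashSize, nil
}

// checkTile (hypothetical helper) reports an error when the data returned
// for a tile does not have the length implied by the tile's path.
func checkTile(path string, data []byte) error {
	want, err := expectedTileLen(path)
	if err != nil {
		return err
	}
	if len(data) != want {
		return fmt.Errorf("bad result slice (%s len=%d, want %d)", path, len(data), want)
	}
	return nil
}

func main() {
	// An empty result with no error from the reader is caught here.
	fmt.Println(checkTile("tile/8/0/000.p/82", nil))
}
```

A check like this only detects the symptom. It says nothing about which layer (test, cache, or web interface) handed back the empty slice.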

FiloSottile commented 5 years ago

From the same test, possibly related.

go test proxy running at GOPROXY=http://127.0.0.1:50887/mod
--- FAIL: TestScript (0.00s)
    --- FAIL: TestScript/mod_concurrent (0.09s)
        script_test.go:190: 
            # Concurrent builds should succeed, even if they need to download modules. (0.071s)
            > go build ./x &
            > go build ./y
            [stderr]
            go: finding rsc.io/sampler v1.0.0
            go: finding golang.org/x/text v0.3.0
            verifying rsc.io/sampler@v1.0.0/go.mod: rsc.io/sampler@v1.0.0/go.mod: malformed record data
            [exit status 1]
            FAIL: testdata\script\mod_concurrent.txt:5: unexpected command failure

https://build.golang.org/log/40fa251078da62c2fcdc0c81bbba821b61f0e7b1

ianlancetaylor commented 5 years ago

https://storage.googleapis.com/go-build-log/eba4bc15/windows-386-2008_0ed3980b.log

jayconrod commented 5 years ago

I'd guess this is a problem with file locking. It looks like cmd/go/internal/sumweb.dbClient only reads and writes whole files through cmd/go/internal/filelock. Is that enough to protect them or do we need to hold a global lock over all tiles? I'm not sure I understand the tile structure well enough to analyze this.

@rsc will have a better idea of what's going on.

rsc commented 5 years ago

I see the bug and am working on a fix. Currently hung up on OS X directory reading being broken again.

rsc commented 5 years ago

OS X directory reading has magically fixed itself again (it was returning spurious empty strings from f.Readdirnames(-1)!) after git sync + make.bash, despite nothing obvious having happened anywhere near that code today.

gopherbot commented 5 years ago

Change https://golang.org/cl/174439 mentions this issue: cmd/go/internal/modfetch: fix concurrent read/write race in modfetch