golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.86k stars 17.52k forks source link

cmd/go: 'list std' unnecessarily opens *_test.go files #31925

Open bradfitz opened 5 years ago

bradfitz commented 5 years ago

While working in an environment with a slow filesystem (immediate stats+readdirs, but slow opens), I noticed that go list std was very slow (~6 seconds), which prompted me to look at what it was doing.

One thing that jumped out was that it's opening *_test.go files, which should not be necessary to answer the question of whether a directory in $GOROOT contains a Go package.

$ go install cmd/go && strace -f -e open,openat go list std 2>&1 | grep _test.go | wc -l
2181

That's about 37% of the overall files opened:

$ go install cmd/go && strace -f -e open,openat go list std 2>&1 | grep O_RDONLY | grep '.go"' | wc -l
5887

Overall, we read 67 MB in from disk just to answer go list std:

$ strace -o trace -f -e read go list std > /dev/null && perl -ne '$num += $1 if /.+?(\d+)/; END{ print "$num\n" }' < trace
67894893

If I make os.Open panic on an _test.go open, I see a number of backtraces just by re-running it a few times:

panic: os.OpenFile opening /home/bradfitz/go/src/regexp/syntax/parse_test.go

goroutine 1 [running]:
os.OpenFile(0xc00002fbc0, 0x31, 0x0, 0x0, 0xc0001565ac, 0x0, 0x200)
        /home/bradfitz/go/src/os/file.go:296 +0x1ca
os.Open(...)
        /home/bradfitz/go/src/os/file.go:276
cmd/go/internal/imports.scanFiles(0xc00027e400, 0xb, 0x10, 0xc000113080, 0x0, 0xc00027e400, 0x8, 0x10, 0x0, 0x7f200d2c5ab0, ...)
        /home/bradfitz/go/src/cmd/go/internal/imports/scan.go:52 +0x20a
cmd/go/internal/imports.ScanDir(0xc0001db740, 0x23, 0xc000113080, 0xc0003bd8e8, 0x47d6f7, 0xc000450256, 0xd, 0xc000156740, 0xffffffffffffffff, 0x97e620, ...)
        /home/bradfitz/go/src/cmd/go/internal/imports/scan.go:39 +0x20b
cmd/go/internal/modload.scanDir(0xc0001db740, 0x23, 0xc000113080, 0x3, 0x0, 0x0, 0xc0001db740, 0x23, 0x0, 0x0, ...)
        /home/bradfitz/go/src/cmd/go/internal/modload/load.go:863 +0x5a
cmd/go/internal/modload.(*loader).doPkg(0xc0000726c0, 0x98dd20, 0xc000397cc0)
        /home/bradfitz/go/src/cmd/go/internal/modload/load.go:771 +0x46c
cmd/go/internal/par.(*Work).runner(0xc00017c080)
        /home/bradfitz/go/src/cmd/go/internal/par/work.go:101 +0x157
cmd/go/internal/par.(*Work).Do(0xc00017c080, 0xa, 0xc0002d8290)
        /home/bradfitz/go/src/cmd/go/internal/par/work.go:69 +0xbf
cmd/go/internal/modload.(*loader).load(0xc0000726c0, 0xc000156f60)
        /home/bradfitz/go/src/cmd/go/internal/modload/load.go:641 +0x9ed
cmd/go/internal/modload.loadPatterns(0xc000020710, 0x1, 0x1, 0xc000149901, 0xc0001571d8, 0xd0, 0x9cf140, 0x448467)
        /home/bradfitz/go/src/cmd/go/internal/modload/load.go:262 +0x25f
cmd/go/internal/modload.ImportPathsQuiet(0xc000020710, 0x1, 0x1, 0x203000, 0x203000, 0x0)
        /home/bradfitz/go/src/cmd/go/internal/modload/load.go:173 +0x84
cmd/go/internal/modload.ImportPaths(0xc000020710, 0x1, 0x1, 0xd0, 0x9cf080, 0xc000034000)
        /home/bradfitz/go/src/cmd/go/internal/modload/load.go:57 +0x43
cmd/go/internal/load.ImportPaths(0xc000020710, 0x1, 0x1, 0x0, 0xe7cf60, 0x7f200d35a008)
        /home/bradfitz/go/src/cmd/go/internal/load/pkg.go:1999 +0x5f
cmd/go/internal/load.PackagesAndErrors(0xc000020710, 0x1, 0x1, 0x0, 0x0, 0x0)
        /home/bradfitz/go/src/cmd/go/internal/load/pkg.go:1946 +0xcd
cmd/go/internal/load.Packages(0xc000020710, 0x1, 0x1, 0xc0001268c0, 0x0, 0x0)
        /home/bradfitz/go/src/cmd/go/internal/load/pkg.go:1923 +0x5a
cmd/go/internal/list.runList(0xe707a0, 0xc000020710, 0x1, 0x1)
        /home/bradfitz/go/src/cmd/go/internal/list/list.go:448 +0x2210
main.main()
        /home/bradfitz/go/src/cmd/go/main.go:188 +0x57f

and

goroutine 279 [running]:
os.OpenFile(0xc00027f3e0, 0x2f, 0x0, 0x0, 0xc000057ce0, 0x0, 0x400)
        /home/bradfitz/go/src/os/file.go:296 +0x1ca
os.Open(...)
        /home/bradfitz/go/src/os/file.go:276
cmd/go/internal/imports.scanFiles(0xc000178700, 0xf, 0x10, 0xc000113080, 0x0, 0xc000178700, 0x8, 0x10, 0x0, 0x0, ...)
        /home/bradfitz/go/src/cmd/go/internal/imports/scan.go:52 +0x20a
cmd/go/internal/imports.ScanDir(0xc00027eba0, 0x21, 0xc000113080, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
        /home/bradfitz/go/src/cmd/go/internal/imports/scan.go:39 +0x20b
cmd/go/internal/modload.scanDir(0xc00027eba0, 0x21, 0xc000113080, 0x3, 0x0, 0x0, 0xc00027eba0, 0x21, 0x0, 0x0, ...)
        /home/bradfitz/go/src/cmd/go/internal/modload/load.go:863 +0x5a
cmd/go/internal/modload.(*loader).doPkg(0xc0000726c0, 0x98dd20, 0xc000368d20)
        /home/bradfitz/go/src/cmd/go/internal/modload/load.go:771 +0x46c
cmd/go/internal/par.(*Work).runner(0xc000176080)
        /home/bradfitz/go/src/cmd/go/internal/par/work.go:101 +0x157
created by cmd/go/internal/par.(*Work).Do
        /home/bradfitz/go/src/cmd/go/internal/par/work.go:67 +0x9b

I tried to hide all the *_test.go files by altering the build context early in the list command,

diff --git a/src/cmd/go/internal/list/list.go b/src/cmd/go/internal/list/list.go
index 4a6633d9a1..0d0c8d36ad 100644
--- a/src/cmd/go/internal/list/list.go
+++ b/src/cmd/go/internal/list/list.go
@@ -9,7 +9,9 @@ import (
        "bufio"
        "bytes"
        "encoding/json"
+       "fmt"
        "io"
+       "io/ioutil"
        "os"
        "sort"
        "strings"
@@ -312,6 +314,7 @@ func runList(cmd *base.Command, args []string) {
        out := newTrackingWriter(os.Stdout)
        defer out.w.Flush()

+       hadEmptyFmt := *listFmt == ""
        if *listFmt == "" {
                if *listM {
                        *listFmt = "{{.String}}"
@@ -417,6 +420,26 @@ func runList(cmd *base.Command, args []string) {
                base.Fatalf("go list -test cannot be used with -find")
        }

+       // Can we skip processing *_test.go files?
+       if !*listTest && !*listJson && hadEmptyFmt {
+               println("filtering")
+               readDir := cfg.BuildContext.ReadDir
+               if readDir == nil {
+                       readDir = ioutil.ReadDir
+               }
+               cfg.BuildContext.ReadDir = func(dir string) ([]os.FileInfo, error) {
+                       println(fmt.Sprintf("in filtered ReadDir(%q)", dir))
+                       fis, err := readDir(dir)
+                       filtered := fis[:0]
+                       for _, fi := range fis {
+                               if !strings.HasSuffix(fi.Name(), "_test.go") {
+                                       filtered = append(filtered, fi)
+                               }
+                       }
+                       return filtered, err
+               }
+       }
+
        load.IgnoreImports = *listFind
        var pkgs []*load.Package
        if *listE {

... but there are several places that are concurrently pre-loading some data, making that not fully effective. (It helps a bit).

/cc @bcmills @jayconrod @rsc @ianlancetaylor

bcmills commented 5 years ago

This seems like a specific example of the more general problem in #29666.

(If you're just listing the package names, we shouldn't be doing any work beyond computing the package names.)

quantonganh commented 10 months ago

This has been fixed?

$ go list std 2>&1 | grep -c _test.go
0
bcmills commented 9 months ago

@quantonganh, the strace command in the steps to reproduce is an important detail.

~/go-review/src$ strace -f -e open,openat go list std 2>&1 | grep _test.go | wc -l
1035