golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.31k stars 17.7k forks source link

cmd/compile: inliner does not inline `binary.*Endian.Uint*` after many inlines in a big function #68081

Open Jorropo opened 5 months ago

Jorropo commented 5 months ago

Go version

go version devel go1.23-4f77a83589 Tue Jun 18 22:25:08 2024 +0000 linux/amd64

go version 1.22.4

go version 1.21.11

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='amd64'
GOBIN=''
GOCACHE='/tmp/go-build'
GOENV='/home/hugo/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMODCACHE='/home/hugo/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/hugo/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/home/hugo/k/go'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='local'
GOTOOLDIR='/home/hugo/k/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='devel go1.23-4f77a83589 Tue Jun 18 22:25:08 2024 +0000'
GODEBUG=''
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/hugo/.config/go/telemetry'
GCCGO='/usr/bin/gccgo'
GOAMD64='v3'
AR='ar'
CC='gcc'
CXX='g++'
CGO_ENABLED='1'
GOMOD='/home/hugo/k/xxhash/go.mod'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build639006716=/tmp/go-build -gno-record-gcc-switches'

What did you do?

Sorry for the huge reproducer, I couldn't minize well, I guess this has something to do with the caller aware inliner making poor decisions due to the function being huge.

$ curl https://pastebin.com/raw/rpyZJMD6 > xxhash_slide.go
$ GOSSAFUNC=slide go build xxhash_slide.go

What did you see happen?

 CALL encoding/binary.littleEndian.Uint64(SB)
 CALL encoding/binary.littleEndian.Uint32(SB)

What did you expect to see?

I expect to see raw MOVQ and MOVL for u64 and u32 functions. Note: the bounds check would be proven at compile time, making a call strictly inferior as there are no panic index branch from binary.*.

gabyhelp commented 5 months ago

Similar Issues

(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.)

Jorropo commented 5 months ago

AFAIK the old inliner were hardcoded to attribute a cost of 1 for cheap calls to binary.*Endian.Uint* and friends as well as always inline theses when possible, this sounds better to me as it's usual to make many calls back to back on the same buffer and aware users prove bound checks ahead of time. Usually like this:

_ = b[:32]
a := binary.LittleEndian.Uint64(b)
a := binary.LittleEndian.Uint64(b[8:])
a := binary.LittleEndian.Uint64(b[16:])
a := binary.LittleEndian.Uint64(b[24:])
Jorropo commented 5 months ago

cc @golang/compiler

Jorropo commented 5 months ago

Well actually, I tried go1.21.11 and it has the same issues.

randall77 commented 5 months ago

It's fixed by changing cmd/compile/internal/inline/inl.go:inlineBigFunctionMaxCost to 80, so I'd say your presumption is correct, the function is large enough that we were forced to turn inlining way down. I'm not sure what, if anything, could be done here. We do need to defend against making functions arbitrarily larger by inlining too much. If we could know with some certainty that inlining would make things smaller, then it would be ok. (We do inline your u64 under that rule.) Unfortunately at inlining time littleEndian.Uint64's body looks kind of big, only after lots of optimization work do we realize it is only a bounds check + a load. @thanm @mdempsky for ideas.

randall77 commented 5 months ago

Hmm, we do treat littleEndian.Uint64 specially in the inliner as far as cost goes (See https://go-review.googlesource.com/c/go/+/349931). So maybe these inlinings should still happen?

Jorropo commented 5 months ago

I was about to say, if we have a list of hardcoded "cheap" functions we should also inline theses.