golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.12k stars 17.68k forks source link

cmd/compile: Eliminate bound checks on access to positive modulo #38476

Open cbeuw opened 4 years ago

cbeuw commented 4 years ago

Given something in the form

rem := len(foo) % m
_ = foo[:rem]

The compiler currently (1.14) inserts a bound check on foo[:rem], but 0 <= rem <= len(foo) is guaranteed since rem is the modulo of a positive integer.

It would be even nicer if the compiler can figure out the general form x > 0 ∧ r = x % m ⇒ 0 ≤ r < m, so that operations like

_ = foo[m]
rem := len(foo) % m
_ = foo[rem]

can have the bound check eliminated too.

This sort of expression usually props up when dealing with block-oriented algorithms, such as ciphers: https://github.com/golang/crypto/blob/0848c9571904fcbcb24543358ca8b5a7dbfde875/chacha20/chacha_generic.go#L213 https://github.com/golang/crypto/blob/0848c9571904fcbcb24543358ca8b5a7dbfde875/poly1305/sum_amd64.go#L42

rasky commented 4 years ago

Can you please post a complete snippet that actually compiles?

cbeuw commented 4 years ago
package main

func test1(foo []byte) {
    // _ = foo[0]
    rem := len(foo) % 64
    _ = foo[:rem]  // I originally put _ = foo[rem] which is wrong as rem==len(foo) is possible
}

func test2(foo []byte) {
    _ = foo[63]
    rem := len(foo) % 64
    _ = foo[rem] 
}

func main(){}

Running go build -gcflags="-d=ssa/check_bce/debug=1" gives

./test.go:4:9: Found IsInBounds
./test.go:6:9: Found IsInBounds
./test.go:10:9: Found IsInBounds
./test.go:12:9: Found IsInBounds

The checks on line 4 and 10 are expected, but line 6 and 12 should be eliminated

cuonglm commented 4 years ago

@cbeuw if len(foo) == 0, then rem is 0, so _ = foo[rem] should not be proved.

cbeuw commented 4 years ago

@cuonglm That's correct, which is why _ = foo[0] needs a bounds check. When it is known len(foo) > 0, then _ = foo[rem] is always safe.

cbeuw commented 4 years ago

@cuonglm Actually there is a +-1 error in the first case, because it's possible rem == len(foo). But the second case still stands since rem is strictly less than m

bmkessler commented 4 years ago

The complication with the general case of proving that

r := x%m && m > 0 --> 0 <= r < m

Is that by the time prove runs, the modulus for constant m will have been strength reduced in general to a form that doesn't involve the modulus operator anymore.

x%C => x-(x/C*C)

However, for the specific case above where m is a power of 2 (x%64 above), then the modulus will be rewritten as x&(64-1) by the compiler. So the relation that needs to be inferred in that case is:

r := x&m && m >= 0 --> 0 <= r <= m

AND of a non-negative number is bounded by zero and that non-negative number. I think adding that fact to prove may be fairly straightforward. @zdjones may have more insight.

zdjones commented 4 years ago

I think this may not quite be a duplicate, but requires the same or similar solution as #29872.

We talked there of delaying div reduction to only the lateopt pass. This would leave the mods in place for the prove pass to see.