golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
122.83k stars 17.51k forks source link

all: racy usage not detected due to assembly code #61204

Open aclements opened 1 year ago

aclements commented 1 year ago

What version of Go are you using (go version)?

$ go version
go version go1.20 linux/amd64

Does this issue reproduce with the latest release?

Yes. Reproduced back to Go 1.10.

What did you do?

Run this program with -race. (Thanks to @cherrymui for putting together the example.)

What did you expect to see?

A race report like the following:

==================
WARNING: DATA RACE
Read at 0x00000051748a by goroutine 6:
  main.IndexByte()
      /home/austin/go.dev/austin/bytesrace.go:29 +0x75
  main.main.func1()
      /home/austin/go.dev/austin/bytesrace.go:18 +0x2a

Previous write at 0x00000051748a by main goroutine:
  main.main()
      /home/austin/go.dev/austin/bytesrace.go:23 +0x65

Goroutine 6 (running) created at:
  main.main()
      /home/austin/go.dev/austin/bytesrace.go:12 +0x30
==================

What did you see instead?

With the call to bytes.IndexByte, there's no race report. There is a race report if you comment out that line and uncomment the line that calls the pure Go IndexByte implementation.

cc @golang/runtime

aclements commented 1 year ago

This happens because bytes.IndexByte calls internal/bytealg.IndexByte, which is implemented in assembly and thus doesn't have any race annotations. Several other functions in bytes are similar.

A few possible solutions:

I stumbled across this because I was looking at the compiler's NoInstrumentPkgs list, which contains almost but not quite all of the dependencies of the runtime package. For the runtime's usage, ideally internal/bytealg would have no race instrumentation, and it's kind of surprising that the instrumentation it does get doesn't cause problems. But, of course, this is in tension with user code calling into bytes, which does want to have race instrumentation.

aclements commented 1 year ago

Another (possibly worse) variation of this problem, this time using just ==, which can be compiled to runtime.memequal under certain circumstances, which doesn't have race instrumentation: https://go.dev/play/p/Rcv6IaDSXdX

In this case, it's the compiler generating the call, so we might be able to just add the missing instrumentation around the call.

cuonglm commented 1 year ago

Fixing problem for bytealg functions seems to be just changing indexbyte_native.go //go:build to:

//go:build (...) && !race

and indexbyte_generic.go to:

//go:build ... || race
aclements commented 1 year ago

Right. We'd have to do that for all of the algorithms in internal/bytealg.

I'm kind of surprised using a race-instrumented version of IndexByteString didn't break the runtime.

cuonglm commented 1 year ago

I'm kind of surprised using a race-instrumented version of IndexByteString didn't break the runtime.

I guess because the function isn't inlined from a race package (internal/bytealg) to a norace package (runtime)?

aclements commented 1 year ago

I'm surprised because there are contexts in the runtime where the race detector will crash or self-loop, regardless of inlining. I guess we just don't use IndexByteString in any such contexts, though depending on that feels like a maintenance hazard.

cherrymui commented 10 months ago

This also happens with other packages that have assembly code, e.g. crypto/aes, see #63817.

egonelbre commented 10 months ago

Adding relevant info from https://github.com/golang/go/issues/63817.

Example of data race not being detected while using crypto/aes https://go.dev/play/p/dK-fJz-p5ey

Going over the standard library, assembly was used in:

$ go list -deps -f "{{if .SFiles}}{{.ImportPath}} {{.SFiles}}{{end}}" std
internal/abi [abi_test.s stub.s]
internal/cpu [cpu.s cpu_x86.s]
internal/bytealg [compare_amd64.s count_amd64.s equal_amd64.s index_amd64.s indexbyte_amd64.s]
runtime/internal/atomic [atomic_amd64.s]
runtime [asm.s asm_amd64.s duff_amd64.s memclr_amd64.s memmove_amd64.s preempt_amd64.s rt0_windows_amd64.s sys_windows_amd64.s test_amd64.s time_windows_amd64.s zcallback_windows.s]
internal/reflectlite [asm.s]
sync/atomic [asm.s]
math [dim_amd64.s exp_amd64.s floor_amd64.s hypot_amd64.s log_amd64.s]
reflect [asm_amd64.s]
hash/crc32 [crc32_amd64.s]
crypto/subtle [xor_amd64.s]
crypto/internal/boring/sig [sig_amd64.s]
crypto/aes [asm_amd64.s gcm_amd64.s]
math/big [arith_amd64.s]
crypto/internal/edwards25519/field [fe_amd64.s]
crypto/internal/nistec [p256_asm_amd64.s]
crypto/internal/bigmod [nat_amd64.s]
crypto/sha512 [sha512block_amd64.s]
crypto/internal/boring/bcache [stub.s]
crypto/md5 [md5block_amd64.s]
crypto/sha1 [sha1block_amd64.s]
crypto/sha256 [sha256block_amd64.s]
vendor/golang.org/x/crypto/internal/poly1305 [sum_amd64.s]
vendor/golang.org/x/sys/cpu [cpu_x86.s]
vendor/golang.org/x/crypto/chacha20poly1305 [chacha20poly1305_amd64.s]
runtime/debug [debug.s]
maps [maps.s]
os/signal [sig.s]
runtime/cgo [asm_amd64.s gcc_amd64.S]
runtime/coverage [dummy.s]
runtime/internal/startlinetest [func_amd64.s]

Similar search for x/crypto packages:

golang.org/x/sys/cpu [cpu_x86.s]
golang.org/x/crypto/blake2b [blake2bAVX2_amd64.s blake2b_amd64.s]
golang.org/x/crypto/argon2 [blamka_amd64.s]
golang.org/x/crypto/blake2s [blake2s_amd64.s]
golang.org/x/crypto/internal/poly1305 [sum_amd64.s]
golang.org/x/crypto/chacha20poly1305 [chacha20poly1305_amd64.s]
golang.org/x/crypto/curve25519/internal/field [fe_amd64.s]
golang.org/x/crypto/salsa20/salsa [salsa20_amd64.s]
golang.org/x/crypto/sha3 [keccakf_amd64.s]

So all of these probably need to be evaluated for possible missing race detection.


There seem to be three main issues:

I guess there are few ways to make the behavior visible to race detector:

Not using assembly can be a significant performance hit in some scenarios, so switching off assembly doesn't seem ideal.

The other approach is to call the race detector functions directly, e.g. runtime.RaceReadRange, runtime.RaceWriteRange. One of the problems is that those functions are only exposed when race tag is present, making it more annoying to instrument third-party packages. Standard library does have access to internal/race, which is conditionally compiled with -race. It would be nice to use a similar package (maybe expose some of it in runtime/race) from third-party party libraries. Of course, it shouldn't expose things like Disable, Enable or Enabled.

Third approach is to add some magic incantation to assembly code to allow the compiler to add appropriate race detection, e.g. rough draft (ignore whether I got the content correct):

// func encryptBlockAsm(nr int, xk *uint32, dst, src *byte)
// race: readrange src, nr*4
// race: readrange dst, nr*4
// race: readrange xk, nr
TEXT ·encryptBlockAsm(SB),NOSPLIT,$0

Of course, writing those annotations in comments is significantly more annoying than using some nicer Go API.


For starters I can add the necessary annotations to standard library using internal/race and create a similar package in golang.org/x/crypto for use there. Unless there's a better suggestion how to approach this issue.

bcmills commented 10 months ago

It seems unfortunate to have manual annotations (including manual calls to runtime.RaceReadRange or similar!) for race reads and writes in assembly code, if nothing can verify that the access pattern described in the annotations. It seems all too easy for the actual implementation to fail to match the annotations, particular as the implementation is revised over time.

Another — probably much slower but higher-fidelity — option might be to have the runtime move the arguments to assembly functions to pages with access protection, and then use a fault handler to step through the assembly routine to observe (and record) the actual access patterns.

egonelbre commented 10 months ago

I definitely agree that manual annotations are fragile, but I don't think they are more fragile than assembly itself.

But, yeah, if there's a way to implement it without having to annotate, then that would be even better. I wasn't able to figure out a way that would work reliably and (reasonably) fast.

egonelbre commented 10 months ago

Sidenote, I did implement helpers for my own purposes:

package race2

import (
    "runtime"
    "unsafe"
)

func ReadSlice[T any](data []T) {
    if len(data) == 0 {
        return
    }
    runtime.RaceReadRange(unsafe.Pointer(unsafe.SliceData(data)), len(data)*int(unsafe.Sizeof(data[0])))
}

func WriteSlice[T any](data []T) {
    if len(data) == 0 {
        return
    }
    runtime.RaceWriteRange(unsafe.Pointer(unsafe.SliceData(data)), len(data)*int(unsafe.Sizeof(data[0])))
}

// didn't need these, but you could have these as well...
func Write[T any](r *T) { ... }
func Read[T any](r *T)  { ... }

Most of the data being read is either a slice or a pointer to a struct.

gopherbot commented 9 months ago

Change https://go.dev/cl/546555 mentions this issue: internal/race: add Read/Write Slice/Value

cuonglm commented 1 month ago

Right. We'd have to do that for all of the algorithms in internal/bytealg.

I'm kind of surprised using a race-instrumented version of IndexByteString didn't break the runtime.

Yay, we can't just do this, because internal/bytealg is considered a runtime package, so it won't be instrumented even when build with -race.

IOW, the build tags condition !race is always true for internal/bytealg.

gopherbot commented 1 month ago

Change https://go.dev/cl/601117 mentions this issue: cmd/compile: add race instrumentation during walkCompare