golang / go

The Go programming language
https://go.dev
BSD 3-Clause "New" or "Revised" License
124.05k stars 17.68k forks source link

cmd/compile: interaction between sync/atomic, -race and go:norace #43007

Open dvyukov opened 3 years ago

dvyukov commented 3 years ago

The issue is discovered on gVisor with coverage and race instrumentation. gVisor has some sensitive functions marked as go:norace (calling into race detector will crash I assume). With atomic coverage instrumentation (the mode bazel uses now) these functions get coverage instrumentation with sync/atomic.AddUint32 calls, which is fine per se because these are just increments of a global so can run in almost any context. But if we add race instrumentation to the mix, atomic calls become super complex (call into race runtime) and crash. This is not an issue for all other operations as the function is marked as go:norace.

Question: should compiler leave atomic operations uninstrumented in such case? It currently transforms atomic operations to machine instructions directly in -race is not enabled, so if it just continued doing so for go:norace functions it would solve the issue. However, I don't know if it does this transformation for all arches. Otherwise it will require calling sync/aromic.NoRaceAddUint32. Another option may be //go:nocover.

go version devel +edf60be151 Fri Dec 4 18:03:43 2020 +0000 linux/amd64

Reproducer:

package main

import "sync/atomic"

var sink interface{}

//go:norace
func main() {
    var x uint32
    x++
    atomic.AddUint32(&x, 1)
    _ = &x
}
$ go build -race norace.go
$ objdump --disassemble=main.main norace

0000000000492760 <main.main>:
  492760:   64 48 8b 0c 25 f8 ff    mov    %fs:0xfffffffffffffff8,%rcx
  492767:   ff ff 
  492769:   48 3b 61 10             cmp    0x10(%rcx),%rsp
  49276d:   76 40                   jbe    4927af <main.main+0x4f>
  49276f:   48 83 ec 20             sub    $0x20,%rsp
  492773:   48 89 6c 24 18          mov    %rbp,0x18(%rsp)
  492778:   48 8d 6c 24 18          lea    0x18(%rsp),%rbp
  49277d:   48 8d 05 9c 7d 00 00    lea    0x7d9c(%rip),%rax        # 49a520 <type.*+0x7520>
  492784:   48 89 04 24             mov    %rax,(%rsp)
  492788:   e8 b3 b1 fa ff          callq  43d940 <runtime.newobject>
  49278d:   48 8b 44 24 08          mov    0x8(%rsp),%rax
  492792:   ff 00                   incl   (%rax)
  492794:   48 89 04 24             mov    %rax,(%rsp)
  492798:   c7 44 24 08 01 00 00    movl   $0x1,0x8(%rsp)
  49279f:   00 
  4927a0:   e8 db df ff ff          callq  490780 <sync/atomic.AddUint32>
  4927a5:   48 8b 6c 24 18          mov    0x18(%rsp),%rbp
  4927aa:   48 83 c4 20             add    $0x20,%rsp
  4927ae:   c3                      retq   
  4927af:   e8 ec aa ff ff          callq  48d2a0 <runtime.morestack_noctxt>
 4927b4:    eb aa                   jmp    492760 <main.main>
dvyukov commented 3 years ago

@dean-deng @avagin

prattmic commented 3 years ago

However, I don't know if it does this transformation for all arches.

No, these optimizations don't apply to all arches, and the specific functions that are intrinisified varies from arch-to-arch. They are defined at: https://cs.opensource.google/go/go/+/master:src/cmd/compile/internal/gc/ssa.go;l=3440;drc=07cba70d5794747044ce5f2f3b34de139193e2a5

That said, most of the sync/atomic functions are (also) aliased to the runtime/internal/atomic equivalents in normal builds, on all arches. In race builds, the sync/atomic functions are not aliased, as they are instrumented, but runtime/internal/atomic aren't. Perhaps //go:norace calls could keep the alias even in race builds?

To be honest, though, that sounds really hacky. //go:norace is not a recursive pragma, so arguably it feels like it should be go tool cover's job not to add calls to non-go:norace functions in go:norace functions. Unfortunately, it can't do this because there is no sync/atomic alternative available.

cc @randall77 @dr2chase

randall77 commented 3 years ago

We could stash a flag in the G to handle this. On entry to any //go:norace function, set the flag. On exit (including panic exits, so with a defer I guess), clear the flag. In the sync/atomic fallback code, change if raceenabled { to if raceenabled && !g.raceflag {. The only question is how this would handle other non-sync/atomic calls out of a go:norace function. It could clear/set the flag around each (if go:norace is not recursive), or not (if go:norace is recursive), or possibly error out if there is any such call (if we are sure that go:norace functions shouldn't be calling anything else).

cherrymui commented 3 years ago

It is true that the optimizations don't apply to all arches, but they do apply to all arches that we currently support the race detector. Stopping the aliasing would also do.

I also agree that this sounds really hacky, though.

dvyukov commented 3 years ago

We've worked around this by not doing coverage instrumentation for the files that contain go:norace (bazel --instrumentation_filter flag). This is not urgent and to be clear: I am not sure what to do here and if we need to do anything at all. Just wanted to write this down and check for other opinions or any potential synergy. So perhaps we close this for now and keep for future reference?

dvyukov commented 3 years ago

To be honest, though, that sounds really hacky. //go:norace is not a recursive pragma, so arguably it feels like it should be go tool cover's job not to add calls to non-go:norace functions in go:norace functions. Unfortunately, it can't do this because there is no sync/atomic alternative available.

Potentially it could emit non-atomic counter increments for these functions. It would solve the problem at hand... but arguably even more hacky :)

The only question is how this would handle other non-sync/atomic calls out of a go:norace function.

Good question. I don't have an answer. But what happened here is that the function did not contain any calls whatsoever (it's like you are in a signal handler thunk with no stack at all or something), calls were inserted by one instrumentation and then messed up by another :)

It is true that the optimizations don't apply to all arches, but they do apply to all arches that we currently support the race detector.

Good. If there are runtime internal versions, then I guess can fallback onto them if race is supported on more arches.