open-telemetry / opamp-go

OpAMP protocol implementation in Go
Apache License 2.0

TestNewSupervisor has a flaky race condition #290

Open tpaschalis opened 5 months ago

tpaschalis commented 5 months ago

I got a flaky failure from TestNewSupervisor while running make all on main (ed38d5f4bf930b57e04581919fbfb676aaa5a5af): the race detector reported a data race between Commander.Start() and Commander.Stop().

$ make all
go test -race ./...
?       github.com/open-telemetry/opamp-go/client/types [no test files]
?       github.com/open-telemetry/opamp-go/internal/testhelpers [no test files]
?       github.com/open-telemetry/opamp-go/protobufs    [no test files]
?       github.com/open-telemetry/opamp-go/protobufshelpers     [no test files]
?       github.com/open-telemetry/opamp-go/server/types [no test files]
ok      github.com/open-telemetry/opamp-go/client       18.988s
ok      github.com/open-telemetry/opamp-go/client/internal      (cached)
ok      github.com/open-telemetry/opamp-go/internal     (cached)
ok      github.com/open-telemetry/opamp-go/server       (cached)
cd internal/examples && go test -race ./...
?       github.com/open-telemetry/opamp-go/internal/examples/agent      [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/agent/agent        [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/server     [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/server/certman     [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/server/data        [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/server/opampsrv    [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/server/uisrv       [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/supervisor [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor/commander    [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor/config       [no test files]
?       github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor/healthchecker        [no test files]
2024/06/30 20:05:03.280981 [OPAMP] Could not load TLS config, working without TLS: open ../../certs/certs/ca.cert.pem: no such file or directory
==================
WARNING: DATA RACE
Write at 0x00c0001e3160 by goroutine 10:
  github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor/commander.(*Commander).Start()
      /home/tpaschalis/GitRepos/opamp-go/internal/examples/supervisor/supervisor/commander/commander.go:52 +0x226
  github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor.(*Supervisor).startAgent()
      /home/tpaschalis/GitRepos/opamp-go/internal/examples/supervisor/supervisor/supervisor.go:415 +0x64
  github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor.(*Supervisor).runAgentProcess()
      /home/tpaschalis/GitRepos/opamp-go/internal/examples/supervisor/supervisor/supervisor.go:512 +0x79a
  github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor.NewSupervisor.func1()
      /home/tpaschalis/GitRepos/opamp-go/internal/examples/supervisor/supervisor/supervisor.go:111 +0x33

Previous read at 0x00c0001e3160 by goroutine 7:
  github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor/commander.(*Commander).Stop()
      /home/tpaschalis/GitRepos/opamp-go/internal/examples/supervisor/supervisor/commander/commander.go:119 +0x64
  github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor.(*Supervisor).Shutdown()
      /home/tpaschalis/GitRepos/opamp-go/internal/examples/supervisor/supervisor/supervisor.go:540 +0xae
  github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor.TestNewSupervisor()
      /home/tpaschalis/GitRepos/opamp-go/internal/examples/supervisor/supervisor/supervisor_test.go:63 +0x175
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x261
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

Goroutine 10 (running) created at:
  github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor.NewSupervisor()
      /home/tpaschalis/GitRepos/opamp-go/internal/examples/supervisor/supervisor/supervisor.go:111 +0x8f9
  github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor.TestNewSupervisor()
      /home/tpaschalis/GitRepos/opamp-go/internal/examples/supervisor/supervisor/supervisor_test.go:60 +0x147
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x261
  testing.(*T).Run.func1()
      /usr/local/go/src/testing/testing.go:1648 +0x44

Goroutine 7 (running) created at:
  testing.(*T).Run()
      /usr/local/go/src/testing/testing.go:1648 +0x845
  testing.runTests.func1()
      /usr/local/go/src/testing/testing.go:2054 +0x84
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1595 +0x261
  testing.runTests()
      /usr/local/go/src/testing/testing.go:2052 +0x8ad
  testing.(*M).Run()
      /usr/local/go/src/testing/testing.go:1925 +0xcd7
  main.main()
      _testmain.go:47 +0x2bd
==================
--- FAIL: TestNewSupervisor (0.00s)
    testing.go:1465: race detected during execution of test
FAIL
FAIL    github.com/open-telemetry/opamp-go/internal/examples/supervisor/supervisor      0.009s
FAIL
make: *** [makefile:18: test] Error 1

The race detector does not catch this race on every run, so the failure is intermittent.

echlebek commented 6 days ago

Maybe you encountered the same thing I did? https://github.com/open-telemetry/opamp-go/issues/256