cosmos / cosmos-sdk

:chains: A Framework for Building High Value Public Blockchains :sparkles:
https://cosmos.network/
Apache License 2.0
6.27k stars 3.64k forks source link

[Bug]: Data race when querying gRPC directly #22368

Open corverroos opened 4 weeks ago

corverroos commented 4 weeks ago

Is there an existing issue for this?

What happened?

We at omni enabled direct gRPC based queries of application state (previously we queried via CometBFT). We have a go unit test that runs a single validator, while quering state using gRPC directly. When run with -race the tests fails with "WARNING: DATA RACE". See stack trace below.

It seems like the cause is one or more fields in BaseApp that are not protected by a mutex, yet are written-to by the ABCI goroutine and read-from gRPC query goroutines. The stack trace shows that checkState is written-to in Commit, and read-from in CreateQueryContext. There may be more fields though.

As a workaround, we only these tests without -race.

Cosmos SDK Version

v0.50.10

How to reproduce?

Run this test but remove the -race build flag and run with go test -race.

corverroos commented 4 weeks ago

Data race stack trace:

==================
WARNING: DATA RACE
Read at 0x00c0058a4a20 by goroutine 2373:
  github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).CreateQueryContext()
      /home/runner/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.50.10/baseapp/abci.go:1233 +0x2e4
  github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).RegisterGRPCServer.func1()
      /home/runner/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.50.10/baseapp/grpcserver.go:49 +0x1aa
  github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).RegisterGRPCServer.func2.ChainUnaryServer.2.1()
      /home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.4.0/chain.go:48 +0xd0
  github.com/grpc-ecosystem/go-grpc-middleware/recovery.UnaryServerInterceptor.func1()
      /home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.4.0/recovery/interceptors.go:33 +0x174
  github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).RegisterGRPCServer.func2.ChainUnaryServer.2()
      /home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.4.0/chain.go:53 +0x22c
  github.com/omni-network/omni/halo/portal/types._Query_Block_Handler()
      /home/runner/work/omni/omni/halo/portal/types/query.pb.go:327 +0x1f3
  github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).RegisterGRPCServer.func2()
      /home/runner/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.50.10/baseapp/grpcserver.go:81 +0x1b5
  google.golang.org/grpc.(*Server).processUnaryRPC()
      /home/runner/go/pkg/mod/google.golang.org/grpc@v1.67.1/server.go:1394 +0x1b4b
  google.golang.org/grpc.(*Server).handleStream()
      /home/runner/go/pkg/mod/google.golang.org/grpc@v1.67.1/server.go:1805 +0x1824
  google.golang.org/grpc.(*Server).serveStreams.func2.1()
      /home/runner/go/pkg/mod/google.golang.org/grpc@v1.67.1/server.go:1029 +0x158
Previous write at 0x00c0058a4a20 by goroutine 1879:
  github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).setState()
      /home/runner/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.50.10/baseapp/baseapp.go:497 +0x7f6
  github.com/cosmos/cosmos-sdk/baseapp.(*BaseApp).Commit()
      /home/runner/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.50.10/baseapp/abci.go:956 +0x59a
  github.com/omni-network/omni/halo/app.(*App).Commit()
      <autogenerated>:1 +0x52
  github.com/cosmos/cosmos-sdk/server.cometABCIWrapper.Commit()
      /home/runner/go/pkg/mod/github.com/cosmos/cosmos-sdk@v0.50.10/server/cmt_abci.go:56 +0x4a
  github.com/cosmos/cosmos-sdk/server.(*cometABCIWrapper).Commit()
      <autogenerated>:1 +0x1b
  github.com/omni-network/omni/halo/app.abciWrapper.Commit()
      /home/runner/work/omni/omni/halo/app/abci.go:160 +0xa5
  github.com/omni-network/omni/halo/app.(*abciWrapper).Commit()
      <autogenerated>:1 +0x7b
  github.com/cometbft/cometbft/abci/client.(*localClient).Commit()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/abci/client/local_client.go:113 +0x158
  github.com/cometbft/cometbft/proxy.(*appConnConsensus).Commit()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/proxy/app_conn.go:109 +0x2cb
  github.com/cometbft/cometbft/state.(*BlockExecutor).Commit()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/state/execution.go:404 +0x368
  github.com/cometbft/cometbft/state.(*BlockExecutor).applyBlock()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/state/execution.go:290 +0x1a39
  github.com/cometbft/cometbft/state.(*BlockExecutor).ApplyVerifiedBlock()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/state/execution.go:202 +0x143e
  github.com/cometbft/cometbft/consensus.(*State).finalizeCommit()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/state.go:1772 +0x1214
  github.com/cometbft/cometbft/consensus.(*State).tryFinalizeCommit()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/state.go:1682 +0x4a4
  github.com/cometbft/cometbft/consensus.(*State).enterCommit.func1()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/state.go:1617 +0x114
  runtime.deferreturn()
      /opt/hostedtoolcache/go/1.23.2/x64/src/runtime/panic.go:605 +0x5d
  github.com/cometbft/cometbft/consensus.(*State).addVote()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/state.go:2335 +0x36c4
  github.com/cometbft/cometbft/consensus.(*State).tryAddVote()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/state.go:2067 +0x6b
  github.com/cometbft/cometbft/consensus.(*State).handleMsg()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/state.go:929 +0x60f
  github.com/cometbft/cometbft/consensus.(*State).receiveRoutine()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/state.go:856 +0x751
  github.com/cometbft/cometbft/consensus.(*State).OnStart.gowrap1()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/state.go:398 +0x35
Goroutine 2373 (running) created at:
  google.golang.org/grpc.(*Server).serveStreams.func2()
      /home/runner/go/pkg/mod/google.golang.org/grpc@v1.67.1/server.go:1040 +0x224
  google.golang.org/grpc/internal/transport.(*http2Server).operateHeaders()
      /home/runner/go/pkg/mod/google.golang.org/grpc@v1.67.1/internal/transport/http2_server.go:630 +0x3881
  google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams()
      /home/runner/go/pkg/mod/google.golang.org/grpc@v1.67.1/internal/transport/http2_server.go:671 +0x415
  google.golang.org/grpc.(*Server).serveStreams()
      /home/runner/go/pkg/mod/google.golang.org/grpc@v1.67.1/server.go:1023 +0x69b
  google.golang.org/grpc.(*Server).handleRawConn.func1()
      /home/runner/go/pkg/mod/google.golang.org/grpc@v1.67.1/server.go:958 +0x86
Goroutine 1879 (running) created at:
  github.com/cometbft/cometbft/consensus.(*State).OnStart()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/state.go:398 +0x22a
  github.com/cometbft/cometbft/libs/service.(*BaseService).Start()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/libs/service/service.go:144 +0x583
  github.com/cometbft/cometbft/consensus.(*Reactor).OnStart()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/consensus/reactor.go:84 +0x214
  github.com/cometbft/cometbft/libs/service.(*BaseService).Start()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/libs/service/service.go:144 +0x583
  github.com/cometbft/cometbft/consensus.(*Reactor).Start()
      <autogenerated>:1 +0x31
  github.com/cometbft/cometbft/p2p.(*Switch).OnStart()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/p2p/switch.go:237 +0x115
  github.com/cometbft/cometbft/libs/service.(*BaseService).Start()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/libs/service/service.go:144 +0x583
  github.com/cometbft/cometbft/node.(*Node).OnStart()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/node/node.go:558 +0x70b
  github.com/cometbft/cometbft/libs/service.(*BaseService).Start()
      /home/runner/go/pkg/mod/github.com/cometbft/cometbft@v0.38.12/libs/service/service.go:144 +0x583
  github.com/omni-network/omni/halo/app.Start()
      /home/runner/work/omni/omni/halo/app/start.go:207 +0x1784
  github.com/omni-network/omni/halo/app_test.TestSmoke()
      /home/runner/work/omni/omni/halo/app/start_test.go:42 +0x1d3
  testing.tRunner()
      /opt/hostedtoolcache/go/1.23.2/x64/src/testing/testing.go:1690 +0x226
  testing.(*T).Run.gowrap1()
      /opt/hostedtoolcache/go/1.23.2/x64/src/testing/testing.go:1743 +0x44
==================
fabtreb commented 3 weeks ago

@tac0turtle hello, any thoughts on priority yet for this issue?