etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system
https://etcd.io
Apache License 2.0

UnsafeRange panicking during shutdown #17223

Open ahrtr opened 11 months ago

ahrtr commented 11 months ago

Bug report criteria

What happened?

Test case TestMaintenanceSnapshotCancel failed with a panic.

Refer to https://github.com/etcd-io/etcd/actions/runs/7463174417/job/20307221683?pr=17220

Based on the log, the likely cause is that the backend had already been closed (the member was being stopped) before the snapshot operation: https://github.com/etcd-io/etcd/blob/a2eb17c8091893796e835cd564c78a7b4c917c21/server/storage/backend/backend.go#L331

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xa6c27c]

goroutine 19029 [running]:
go.etcd.io/bbolt.(*Cursor).seek(0xc00044a578, {0x1aa05d4, 0x4, 0x4})
    /home/runner/go/pkg/mod/go.etcd.io/bbolt@v1.3.8/cursor.go:159 +0x7c
go.etcd.io/bbolt.(*Bucket).Bucket(0xc0009da638, {0x1aa05d4, 0x4, 0x4})
    /home/runner/go/pkg/mod/go.etcd.io/bbolt@v1.3.8/bucket.go:105 +0x10c
go.etcd.io/bbolt.(*Tx).Bucket(...)
    /home/runner/go/pkg/mod/go.etcd.io/bbolt@v1.3.8/tx.go:102
go.etcd.io/etcd/server/v3/storage/backend.(*batchTx).UnsafeRange(0xc0017a7da0, {0x12e17b0, 0x1b0dc40}, {0x1aa1670, 0x10, 0x10}, {0x0, 0x0, 0x0}, 0xc4798?)
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/backend/batch_tx.go:174 +0xa0
go.etcd.io/etcd/server/v3/storage/schema.UnsafeReadConsistentIndex({0xffff39290758, 0xc0017a7da0})
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/schema/cindex.go:41 +0xa0
go.etcd.io/etcd/server/v3/storage/schema.unsafeUpdateConsistentIndex.func1()
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/schema/cindex.go:80 +0x54
go.etcd.io/etcd/client/pkg/v3/verify.Verify(0xc00044aa28)
    /home/runner/actions-runner/_work/etcd/etcd/client/pkg/verify/verify.go:71 +0x44
go.etcd.io/etcd/server/v3/storage/schema.unsafeUpdateConsistentIndex({0x12e5920, 0xc0017a7da0}, 0x7, 0x2, 0x0)
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/schema/cindex.go:79 +0x1d4
go.etcd.io/etcd/server/v3/storage/schema.UnsafeUpdateConsistentIndex(...)
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/schema/cindex.go:67
go.etcd.io/etcd/server/v3/etcdserver/cindex.(*consistentIndex).UnsafeSave(0xc001727800, {0x12e5920, 0xc0017a7da0})
    /home/runner/actions-runner/_work/etcd/etcd/server/etcdserver/cindex/cindex.go:121 +0x68
go.etcd.io/etcd/server/v3/storage.(*BackendHooks).OnPreCommitUnsafe(0xc001a0d9e0, {0x12e5920?, 0xc0017a7da0})
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/hooks.go:45 +0x64
go.etcd.io/etcd/server/v3/storage/backend.(*batchTxBuffered).unsafeCommit(0xc0017a7da0, 0x0)
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/backend/batch_tx.go:342 +0xfc
go.etcd.io/etcd/server/v3/storage/backend.(*batchTxBuffered).commit(0xc0017a7da0, 0xdc?)
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/backend/batch_tx.go:335 +0x70
go.etcd.io/etcd/server/v3/storage/backend.(*batchTxBuffered).Commit(0xc0017a7da0)
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/backend/batch_tx.go:322 +0x3c
go.etcd.io/etcd/server/v3/storage/backend.(*backend).Snapshot(0xc000329140)
    /home/runner/actions-runner/_work/etcd/etcd/server/storage/backend/backend.go:331 +0x54
go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.(*maintenanceServer).Snapshot(0xc0029ace40, 0x12e1120?, {0x12e6550, 0xc0006302b0})
    /home/runner/actions-runner/_work/etcd/etcd/server/etcdserver/api/v3rpc/maintenance.go:111 +0x110
go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.(*authMaintenanceServer).Snapshot(0xc0019d7b60, 0x103b4c0?, {0x12e6550, 0xc0006302b0})
    /home/runner/actions-runner/_work/etcd/etcd/server/etcdserver/api/v3rpc/maintenance.go:296 +0xb8
go.etcd.io/etcd/api/v3/etcdserverpb._Maintenance_Snapshot_Handler({0x1004a40?, 0xc0019d7b60}, {0x12e3fc8, 0xc0028e4c00})
    /home/runner/actions-runner/_work/etcd/etcd/api/etcdserverpb/rpc.pb.go:7620 +0xe8
github.com/grpc-ecosystem/go-grpc-prometheus.init.(*ServerMetrics).StreamServerInterceptor.func4({0x1004a40, 0xc0019d7b60}, {0x12e3d80?, 0xc0005681e0}, 0xc0028e4bd0, 0x111e8a8)
    /home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-prometheus@v1.2.0/server_metrics.go:121 +0x128
go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.Server.ChainStreamServer.func7.1.1({0x1004a40, 0xc0019d7b60}, {0x12e3d80, 0xc0005681e0})
    /home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/chain.go:49 +0x70
go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.newStreamInterceptor.func1({0x1004a40, 0xc0019d7b60}, {0x12e3d80, 0xc0005681e0}, 0xc0028e4bd0, 0xc001992240)
    /home/runner/actions-runner/_work/etcd/etcd/server/etcdserver/api/v3rpc/interceptor.go:258 +0x560
go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.Server.ChainStreamServer.func7.1.1({0x1004a40, 0xc0019d7b60}, {0x12e3d80, 0xc0005681e0})
    /home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/chain.go:49 +0x70
go.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.Server.ChainStreamServer.func7({0x1004a40, 0xc0019d7b60}, {0x12e3d80, 0xc0005681e0}, 0xc0028e4bd0, 0x111e8a8)
    /home/runner/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/chain.go:58 +0x100
google.golang.org/grpc.(*Server).processStreamingRPC(0xc00162c400, {0x12e1120, 0xc00185e870}, {0x12e8340, 0xc000753380}, 0xc002846b40, 0xc0028ac2d0, 0x1b0c680, 0x0)
    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1673 +0x16b8
google.golang.org/grpc.(*Server).handleStream(0xc00162c400, {0x12e8340, 0xc000753380}, 0xc002846b40)
    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1787 +0x12f0
google.golang.org/grpc.(*Server).serveStreams.func2.1()
    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1016 +0xa0
created by google.golang.org/grpc.(*Server).serveStreams.func2 in goroutine 19027
    /home/runner/go/pkg/mod/google.golang.org/grpc@v1.60.1/server.go:1027 +0x1d8

What did you expect to happen?

No panic when processing any client request.

How can we reproduce it (as minimally and precisely as possible)?

Write an integration test that stops a member before calling the snapshot API, as sketched below.
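A rough sketch of such a test using the clientv3 maintenance API (the stopMember helper and the endpoint are placeholders; etcd's integration framework provides the real equivalents):

```go
package integration

import (
	"context"
	"io"
	"testing"

	clientv3 "go.etcd.io/etcd/client/v3"
)

// stopMember stands in for the integration framework's member-stop helper.
var stopMember = func() { /* stop the target member here */ }

// Hypothetical reproduction: race member shutdown against a Snapshot stream.
func TestSnapshotDuringShutdown(t *testing.T) {
	cli, err := clientv3.New(clientv3.Config{Endpoints: []string{"127.0.0.1:2379"}})
	if err != nil {
		t.Fatal(err)
	}
	defer cli.Close()

	go stopMember() // begin stopping the member concurrently with the snapshot

	// Before the fix, Snapshot could reach a backend that was already closed
	// and panic in UnsafeRange; a clean error here is the expected behavior.
	rc, err := cli.Snapshot(context.Background())
	if err != nil {
		t.Logf("snapshot rejected cleanly: %v", err)
		return
	}
	defer rc.Close()
	if _, err := io.Copy(io.Discard, rc); err != nil {
		t.Logf("snapshot stream ended with error (no panic): %v", err)
	}
}
```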

Anything else we need to know?

No response

Etcd version (please run commands below)

```console
$ etcd --version
# paste output here
$ etcdctl version
# paste output here
```

Etcd configuration (command line flags or environment variables)

# paste your configuration here

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

```console
$ etcdctl member list -w table
# paste output here
$ etcdctl --endpoints= endpoint status -w table
# paste output here
```

Relevant log output

No response

fuweid commented 10 months ago

UPDATED:

When the server closes the backend, the backend stops the background commit goroutine and resets the read transaction:

https://github.com/etcd-io/etcd/blob/c8b4b168aff7f84d3b10dc9ddc13a381f15a710c/server/storage/backend/read_tx.go#L133-L138

baseReadTx, however, neither checks the tx in UnsafeRange nor updates txWg:

https://github.com/etcd-io/etcd/blob/c8b4b168aff7f84d3b10dc9ddc13a381f15a710c/server/storage/backend/batch_tx.go#L367-L376

As a result, the tx can be set to nil while UnsafeRange is still running.
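To make the race concrete, here is a much-simplified sketch of the two paths (illustrative only; the real types in read_tx.go/batch_tx.go carry more state and locking):

```go
package sketch

import (
	"sync"

	bolt "go.etcd.io/bbolt"
)

// Simplified illustration of the race; not the actual etcd types.
type baseReadTx struct {
	tx   *bolt.Tx        // reset to nil when the backend is closed
	txWg *sync.WaitGroup // meant to track in-flight users of tx
}

// Shutdown path (compare read_tx.go L133-L138): the backend resets tx.
func (rt *baseReadTx) reset() {
	rt.tx = nil
}

// Read path (compare batch_tx.go L367-L376): tx is dereferenced without a
// nil check and without registering with txWg, so reset() can race with it.
func (rt *baseReadTx) unsafeRange(bucket []byte) {
	_ = rt.tx.Bucket(bucket) // nil pointer dereference if tx was already reset
}
```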

serathius commented 8 months ago

Another case with https://github.com/etcd-io/bbolt/issues/715

serathius commented 8 months ago

@fuweid would you be able to propose a fix? The issue showed up in robustness tests, which I would prefer to keep flake-free.

fuweid commented 8 months ago

Hi @serathius, sure. I'll file a pull request later.

fuweid commented 7 months ago

Hi @ahrtr @serathius

Sorry for taking so long on this issue. It has been fixed at the gRPC layer by https://github.com/grpc/grpc-go/commit/61eab37838ce213237ecb31aa7cdf95241851431 (released in v1.61.0 to fix a regression): all requests are tracked by the handlerWG wait group, so when we call GracefulStop, it blocks until all in-flight requests are closed, including streaming RPCs like Watch/Snapshot/LeaseRenew.

We call GracefulStop when we receive a SIGTERM signal, except in cmux mode:

https://github.com/etcd-io/etcd/blob/a7f5d4b4e4569bd316277ebf1347785e0467c64d/server/embed/etcd.go#L474-L493

We don't need to set up a timeout for draining, because GracefulStop always blocks until all RPCs have finished. Since GracefulStop isn't applied in cmux mode, I filed pull request https://github.com/etcd-io/etcd/pull/17790 to support graceful shutdown in cmux mode as well. Even if we run etcd without #17790, only in-flight Snapshot/Watch RPCs might panic, because we stop the applier channel and scheduler before stopping the backend, so unary RPCs won't use a closed backend:

https://github.com/etcd-io/etcd/blob/a7f5d4b4e4569bd316277ebf1347785e0467c64d/server/etcdserver/server.go#L816-L827
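For reference, a minimal sketch of the gRPC-side mechanics (my own illustration against grpc-go v1.61.0+, not etcd's actual server setup):

```go
package main

import (
	"log"
	"net"

	"google.golang.org/grpc"
)

func main() {
	lis, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		log.Fatal(err)
	}

	// WaitForHandlers (experimental, v1.61.0+) makes Stop wait until all
	// method handlers have returned instead of returning once connections
	// close; GracefulStop drains in-flight RPCs before shutting down.
	srv := grpc.NewServer(grpc.WaitForHandlers(true))
	// ... register services here ...
	go srv.Serve(lis)

	// On SIGTERM: stop accepting new RPCs and block until in-flight ones
	// (including streams like Snapshot/Watch) finish, so the backend is
	// only closed after no handler can still touch it.
	srv.GracefulStop()
}
```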

And https://github.com/etcd-io/etcd/pull/17757 is also an enhancement for the failpoint tests. PTAL, thanks.


Side note: I was using an old version (v1.60.1), so my earlier approach was to introduce a txRef object to maintain a reference count of all open TXs. The caller must call txPut to release the reference explicitly, and if the backend has been closed, ReadTx/ConcurrentReadTx/BatchTx should return a "closed" error. However, in our codebase both the mvcc and auth layers assume the backend is always valid; ReadTx/ConcurrentReadTx/BatchTx are essentially direct pointers, and UnsafeRange isn't designed to return an error. I tried updating all the interfaces to force them to return errors, and it passed all the e2e and unit tests, but the change was too large.

It would be better if the server layer could track active RPCs. So I revisited the gRPC code and found that WaitForHandlers can help us. For reference, the txRef interface from the abandoned approach looked like this:

```go
type txRef struct {
    sync.RWMutex
    wg sync.WaitGroup
}

type Backend interface {
    ReadTx() (ReadTx, TxRefReleaseFunc, error)
    ConcurrentReadTx() (ReadTx, TxRefReleaseFunc, error)
    BatchTx() (BatchTx, TxRefReleaseFunc, error)
}

tx, txPut, err := ReadTx() // or ConcurrentReadTx() / BatchTx()
if err != nil {
    return err
}
defer txPut()
...
```
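And one way the reference counting could be wired up (my own sketch of the idea, returning just the release func for brevity; ErrBackendClosed and the TxRefReleaseFunc definition are assumptions, not code from an actual branch):

```go
package sketch

import (
	"errors"
	"sync"
)

// Assumed definition for the sketch.
type TxRefReleaseFunc func()

var ErrBackendClosed = errors.New("backend: already closed")

type txRef struct {
	sync.RWMutex
	wg sync.WaitGroup
}

type backend struct {
	ref    txRef
	closed bool
}

// ReadTx hands out a release func with the transaction; callers must defer
// the release, as in the usage snippet above.
func (b *backend) ReadTx() (TxRefReleaseFunc, error) {
	b.ref.RLock()
	defer b.ref.RUnlock()
	if b.closed {
		return nil, ErrBackendClosed
	}
	b.ref.wg.Add(1) // count this open TX
	return func() { b.ref.wg.Done() }, nil
}

// Close flips the flag, then waits for every handed-out TX to be released,
// so no UnsafeRange can still be running when the bolt tx is torn down.
func (b *backend) Close() error {
	b.ref.Lock()
	b.closed = true
	b.ref.Unlock()
	b.ref.wg.Wait()
	return nil
}
```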