vegaprotocol / vega

A Go implementation of the Vega Protocol, a protocol for creating and trading derivatives on a fully decentralised network.
https://vega.xyz
GNU Affero General Public License v3.0

[Bug]: Panic during checkpoint restart due to successor markets #8772

Closed jgsbennett closed 1 year ago

jgsbennett commented 1 year ago

Problem encountered

I see a panic in the checkpoint restart code, related to successor markets, when I run all the successor market tests from the jake_successor_markets test branch twice. (I'm guessing one of the later tests does something that the LNL test earlier in the run doesn't like, but I'll try to narrow that down.)

Observed behaviour

Panic on second attempt

Expected behaviour

No panics. Two successful runs back to back

Steps to reproduce

1. pytest -m successor_markets
2. pytest -m successor_markets

Software version

03b02dc9fed3ba9d24a912426345ad179d45fc79

Failing test

No response

Jenkins run

No response

Configuration used

No response

Relevant log output

panic: runtime error: invalid memory address or nil pointer dereference [recovered]
    panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x2c88265]

goroutine 215 [running]:
code.vegaprotocol.io/vega/cmd/vega/commands/node.(*Command).Run.func1.1()
    /workspace/vega/cmd/vega/commands/node/node.go:137 +0x59
panic({0x48998a0, 0x7442c20})
    /usr/local/go/src/runtime/panic.go:884 +0x212
code.vegaprotocol.io/vega/core/governance.validateSuccessorMarket(0xc0035e2ec8, 0xc002e2ea60)
    /workspace/vega/core/governance/market.go:650 +0x85
code.vegaprotocol.io/vega/core/governance.validateNewMarketChange(0xc0035e2ec8, {0x57d14b0?, 0xc000244840}, 0x0?, {0x57d5388, 0xc00030df80}, 0x0?, 0xc002e2ecf0, 0x0?)
    /workspace/vega/core/governance/market.go:622 +0xcb
code.vegaprotocol.io/vega/core/governance.(*Engine).intoToSubmit(0xc000244f00, {0x57bdfc8, 0xc0012deba0}, 0xc0034e3600, 0xedc3f7daf?, 0x1)
    /workspace/vega/core/governance/engine.go:622 +0x318
code.vegaprotocol.io/vega/core/governance.(*Engine).Load(0xc000244f00, {0x57bdfc8, 0xc0012deba0}, {0xc00305c000, 0xabc4, 0xc000})
    /workspace/vega/core/governance/checkpoint.go:105 +0x799
code.vegaprotocol.io/vega/core/checkpoint.(*Engine).load(0xc00049ee10, {0x57bdfc8, 0xc0012deba0}, 0xc002e2f8f8)
    /workspace/vega/core/checkpoint/engine.go:337 +0xaf9
code.vegaprotocol.io/vega/core/checkpoint.(*Engine).UponGenesis(0xc00049ee10, {0x57bdfc8, 0xc0012deba0}, {0xc002124000, 0x1cde3, 0x1e000})
    /workspace/vega/core/checkpoint/engine.go:171 +0x4a8
code.vegaprotocol.io/vega/core/genesis.(*Handler).OnGenesis(0xc0015ca640, {0x57bdfc8, 0xc0012deba0}, {0x77555a0?, 0x46ae020?, 0x0?}, {0xc002124000, 0x1cde3, 0x1e000})
    /workspace/vega/core/genesis/handler.go:64 +0x3bd
code.vegaprotocol.io/vega/core/processor.(*App).OnInitChain(0xc000221200, {{0x19fa7d18, 0xedc3f7b2f, 0x0}, {0xc000623cd0, 0xb}, 0xc0010d2240, {0xc0012de9c0, 0x2, 0x2}, ...})
    /workspace/vega/core/processor/abci.go:708 +0x4ea
code.vegaprotocol.io/vega/core/blockchain/abci.(*App).InitChain(0xc0016cc160, {{0x19fa7d18, 0xedc3f7b2f, 0x0}, {0xc000623cd0, 0xb}, 0xc0010d2240, {0xc0012de9c0, 0x2, 0x2}, ...})
    /workspace/vega/core/blockchain/abci/abci.go:39 +0xbb
code.vegaprotocol.io/vega/cmd/vega/commands/node.(*appW).InitChain(0x18?, {{0x19fa7d18, 0xedc3f7b2f, 0x0}, {0xc000623cd0, 0xb}, 0xc0010d2240, {0xc0012de9c0, 0x2, 0x2}, ...})
    /workspace/vega/cmd/vega/commands/node/app_wrapper.go:60 +0x7d
github.com/tendermint/tendermint/abci/client.(*localClient).InitChainSync(0xc00108aba0, {{0x19fa7d18, 0xedc3f7b2f, 0x0}, {0xc000623cd0, 0xb}, 0xc0010d2240, {0xc0012de9c0, 0x2, 0x2}, ...})
    /home/jake-vega/go/pkg/mod/github.com/vegaprotocol/cometbft@v0.34.28-0.20230322133204-3d8588de736e/abci/client/local_client.go:272 +0x118
github.com/tendermint/tendermint/proxy.(*appConnConsensus).InitChainSync(0x1999971?, {{0x19fa7d18, 0xedc3f7b2f, 0x0}, {0xc000623cd0, 0xb}, 0xc0010d2240, {0xc0012de9c0, 0x2, 0x2}, ...})
    /home/jake-vega/go/pkg/mod/github.com/vegaprotocol/cometbft@v0.34.28-0.20230322133204-3d8588de736e/proxy/app_conn.go:77 +0x55
github.com/tendermint/tendermint/consensus.(*Handshaker).ReplayBlocks(_, {{{0xb, 0x1}, {0x4ecaa39, 0x7}}, {0xc000623cd0, 0xb}, 0x1, 0x0, {{0x0, ...}, ...}, ...}, ...)
    /home/jake-vega/go/pkg/mod/github.com/vegaprotocol/cometbft@v0.34.28-0.20230322133204-3d8588de736e/consensus/replay.go:319 +0xd78
github.com/tendermint/tendermint/consensus.(*Handshaker).Handshake(0xc002e30d90, {0x57ed218, 0xc000ef7c70})
    /home/jake-vega/go/pkg/mod/github.com/vegaprotocol/cometbft@v0.34.28-0.20230322133204-3d8588de736e/consensus/replay.go:268 +0x3d4
github.com/tendermint/tendermint/node.doHandshake({_, _}, {{{0xb, 0x0}, {0x4ecaa39, 0x7}}, {0xc000623cd0, 0xb}, 0x1, 0x0, ...}, ...)
    /home/jake-vega/go/pkg/mod/github.com/vegaprotocol/cometbft@v0.34.28-0.20230322133204-3d8588de736e/node/node.go:329 +0x1b8
github.com/tendermint/tendermint/node.NewNode(0xc00117f680, {0x57b5d60, 0xc0016c4640}, 0xc00105bf00, {0x5798680, 0xc000365578}, 0xc000079400?, 0x1?, 0xc00105bf40, {0x57bc470, ...}, ...)
    /home/jake-vega/go/pkg/mod/github.com/vegaprotocol/cometbft@v0.34.28-0.20230322133204-3d8588de736e/node/node.go:779 +0x597
code.vegaprotocol.io/vega/core/blockchain/abci.NewTmNode({{0x0}, 0x1, 0x1, 0x0, 0x0, {0xc000f964b0, 0xa}, {{0x0}, {0xc000630228, 0x15}}, ...}, ...)
    /workspace/vega/core/blockchain/abci/tm_node.go:83 +0x705
code.vegaprotocol.io/vega/cmd/vega/commands/node.(*Command).startABCI(0xc000580800, 0x0?, {0x57e6b20, 0xc0004aca60}, {0x7fffd6df3f15, 0x2f}, {0x0?, 0x0?}, {0x0, 0x0})
    /workspace/vega/cmd/vega/commands/node/node.go:412 +0x1b6
code.vegaprotocol.io/vega/cmd/vega/commands/node.(*Command).startBlockchain(0xc000580800, 0x0?, {0x7fffd6df3f15?, 0x0?}, {0x0, 0x0}, {0x0, 0x0})
    /workspace/vega/cmd/vega/commands/node/node.go:299 +0x3ab
code.vegaprotocol.io/vega/cmd/vega/commands/node.(*Command).Run.func1()
    /workspace/vega/cmd/vega/commands/node/node.go:140 +0x90
created by code.vegaprotocol.io/vega/cmd/vega/commands/node.(*Command).Run
    /workspace/vega/cmd/vega/commands/node/node.go:131 +0x66f

jgsbennett commented 1 year ago

I think this is the checkpoint that is being processed at the point this all goes wrong: 20230711165044-875-10e97a79e2c10039375e23172427766b88640648b30b4673863e6a714179194b.zip

jgsbennett commented 1 year ago

I can reproduce the failure by rerunning this job: https://jenkins.ops.vega.xyz/blue/organizations/jenkins/common%2Fsystem-tests-wrapper/detail/system-tests-wrapper/83024/pipeline/506

This runs exactly two tests in a very specific order, which reproduces the issue. (In case the job disappears from history, it is achieved by:

jgsbennett commented 1 year ago

This also showed up in the snapshot soak run: https://jenkins.ops.vega.xyz/blue/organizations/jenkins/common%2Fsnapshot-soak-tests/detail/snapshot-soak-tests/8475/pipeline

jgsbennett commented 1 year ago

The offending test is now skipped in https://github.com/vegaprotocol/system-tests/pull/2224 so that we can get those new tests merged. You'll need to turn it back on in a branch to see this fail.