omni-network / omni

Monorepo for Omni node, contracts and other related tools
https://omni.network
GNU General Public License v3.0
81 stars 46 forks source link

feat(attest): keep fewer attestations in state #1343

Closed arajasek closed 2 months ago

arajasek commented 3 months ago

See here for a detailed discussion. We need to:

There will likely be at least a few more issues revealed by doing this. Here's an example failure I ran into while experimenting with running the CI test suite with this change (not investigated yet):

4-06-25 17:13:44.263 ERRO Finalize req failed [BUG]                height=29 err="latest approved: get finalized attestation: not found" stacktrace="[errors.go:39 keeper.go:473 keeper.go:258 keeper.go:560 module.go:85 module.go:799 app.go:170 baseapp.go:792 abci.go:819 abci.go:884 cmt_abci.go:44 abci.go:94 local_client.go:185 app_conn.go:104 execution.go:213 state.go:1771 state.go:1682 state.go:1617 state.go:1655 state.go:2334 state.go:2066 state.go:929 state.go:836 asm_arm64.s:1222]"
24-06-25 17:13:44.263 ERRO error in proxyAppConn.FinalizeBlock      module=state err="latest approved: get finalized attestation: not found" stacktrace="[errors.go:39 keeper.go:473 keeper.go:258 keeper.go:560 module.go:85 module.go:799 app.go:170 baseapp.go:792 abci.go:819 abci.go:884 cmt_abci.go:44 abci.go:94 local_client.go:185 app_conn.go:104 execution.go:213 state.go:1771 state.go:1682 state.go:1617 state.go:1655 state.go:2334 state.go:2066 state.go:929 state.go:836 asm_arm64.s:1222]"
24-06-25 17:13:44.263 ERRO CONSENSUS FAILURE!!!                     module=consensus err="failed to apply block; error latest approved: get finalized attestation: not found"
  stack=
  │ goroutine 47 [running]:
  │ runtime/debug.Stack()
  │ \t/opt/homebrew/Cellar/go/1.22.1/libexec/src/runtime/debug/stack.go:24 +0x64
  │ github.com/cometbft/cometbft/consensus.(*State).receiveRoutine.func2()
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:801 +0x44
  │ panic({0x1bc6d60?, 0x400199f880?})
  │ \t/opt/homebrew/Cellar/go/1.22.1/libexec/src/runtime/panic.go:770 +0x124
  │ github.com/cometbft/cometbft/consensus.(*State).finalizeCommit(0x40025b8008, 0x1d)
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:1780 +0xacc
  │ github.com/cometbft/cometbft/consensus.(*State).tryFinalizeCommit(0x40025b8008, 0x1d)
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:1682 +0x26c
  │ github.com/cometbft/cometbft/consensus.(*State).enterCommit.func1()
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:1617 +0x8c
  │ github.com/cometbft/cometbft/consensus.(*State).enterCommit(0x40025b8008, 0x1d, 0x0)
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:1655 +0xabc
  │ github.com/cometbft/cometbft/consensus.(*State).addVote(0x40025b8008, 0x4000ad6f70, {0x4000962570, 0x28})
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:2334 +0x1848
  │ github.com/cometbft/cometbft/consensus.(*State).tryAddVote(0x40025b8008, 0x4000ad6f70, {0x4000962570?, 0x1455de4?})
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:2066 +0x28
  │ github.com/cometbft/cometbft/consensus.(*State).handleMsg(0x40025b8008, {{0x25d7b80, 0x400007d370}, {0x4000962570, 0x28}})
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:929 +0x300
  │ github.com/cometbft/cometbft/consensus.(*State).receiveRoutine(0x40025b8008, 0x0)
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:836 +0x2c4
  │ created by github.com/cometbft/cometbft/consensus.(*State).OnStart in goroutine 150
  │ \t/Users/aayushrajasekaran/go/pkg/mod/github.com/cometbft/cometbft@v0.38.7/consensus/state.go:398 +0xf0
arajasek commented 3 months ago

Sanity-check: confirm that trimLag being set to 1 still actually keeps one block's worth of attestations.

arajasek commented 2 months ago

Update: We've unearthed a few more issues with the deletion logic. We're currently trying to identify all cases when an attestation should be protected from deletion, and discussing implementation details.

arajasek commented 2 months ago

Closing in favour of https://github.com/omni-network/omni/issues/1384, where work is being discussed further.