omni-network / omni

Monorepo for Omni node, contracts and other related tools
https://omni.network
GNU General Public License v3.0
81 stars 46 forks source link

fix(attest): don't delete finalized attestations needed by fuzzy chain #1384

Closed arajasek closed 2 months ago

arajasek commented 3 months ago

There's a dangerous edge-case in the deletion logic due to the inter-dependence of fuzzy and finalized chains. The deletion logic is careful not to delete the latest approved attestation for all chains. However, in an edge-case, the latest approved attestation for a fuzzy chain might be overridden by a finalized attestation that gets deleted. Consider the following case:

The latest attestation offsets we will calculate are 10 for the fuzzy chain, and 11 for the finalized chain. As a result, we will not delete A (it's a fuzzy attestation with offset = 10), and will not delete C (it's a finalized attestation with offset = 11), but we WILL delete B (since, from the perspective of the finalized chain, it's stale). This is a bug, since A links to B, but B has now been deleted. It leads to crashes such as this:

4-06-25 17:13:44.263 ERRO Finalize req failed [BUG]                height=29 err="latest approved: get finalized attestation: not found" stacktrace="[errors.go:39 keeper.go:473 keeper.go:258 keeper.go:560 module.go:85 module.go:799 app.go:170 baseapp.go:792 abci.go:819 abci.go:884 cmt_abci.go:44 abci.go:94 local_client.go:185 app_conn.go:104 execution.go:213 state.go:1771 state.go:1682 state.go:1617 state.go:1655 state.go:2334 state.go:2066 state.go:929 state.go:836 asm_arm64.s:1222]"
24-06-25 17:13:44.263 ERRO error in proxyAppConn.FinalizeBlock      module=state err="latest approved: get finalized attestation: not found" stacktrace="[errors.go:39 keeper.go:473 keeper.go:258 keeper.go:560 module.go:85 module.go:799 app.go:170 baseapp.go:792 abci.go:819 abci.go:884 cmt_abci.go:44 abci.go:94 local_client.go:185 app_conn.go:104 execution.go:213 state.go:1771 state.go:1682 state.go:1617 state.go:1655 state.go:2334 state.go:2066 state.go:929 state.go:836 asm_arm64.s:1222]"
24-06-25 17:13:44.263 ERRO CONSENSUS FAILURE!!!                     module=consensus err="failed to apply block; error latest approved: get finalized attestation: not found"

I'm not sure what the right thing to do here is. What I would like to do is to change the latestAttestation method to always check the finalized chain (when called with a fuzzy chain) for a later attestation (and return that if found). This is the correct value to be considered as the "latest" attestation. However, I'm concerned that some callers of this method may be confused by this change of behaviour.

arajasek commented 2 months ago

ADR: https://www.notion.so/omni-network/Keep-fewer-attestations-in-state-62f28f9367f24782a06af256c2348224