optakt / flow-dps

Flow Data Provisioning Service
Apache License 2.0
29 stars 13 forks source link

Live: sanity check sealed state commitment #395

Closed awfm9 closed 3 years ago

awfm9 commented 3 years ago

When a block is finalized, we should go through all of its seals and double check in the index database if we indexed the right state commitment, which originally came from the execution node's data record. If there is a mismatch, it means that the execution node generated a wrong execution result and we should crash (the Flow system currently doesn't handle that at all, afaik).

Ullaakut commented 3 years ago

@awfm9 Is it fine for you if I create a component specifically for this sanity check? It would be a very simple and small one that has access to the index and has a Check method. I think it's the cleanest way to do this, as I don't think this sanity check fits with the trackers as they are (since they should not read the index IMO) and it would also clutter the mapper.

// Checker compares the seals that were indexed with the ones that are found in finalized blocks.
// Its goal is to act as a sanity check to ensure that no wrong execution result was computed.
// Its database dependency is the DPS index.
type Checker struct {
    reader  index.Reader
    storage storage.Library
}

// New creates a new sanity checker that uses the given storage library and index database.
func New(reader index.Reader, storage storage.Library) *Checker {
    c := Checker{
        reader:  reader,
        storage: storage,
    }

    return &c
}

// Check verifies the seals within the given record by comparing them against the indexed ones.
func (c *Checker) Check(record *uploader.BlockData) error {

    for _, seal := range record.Block.Payload.Seals {
        // Retrieve the block height for that seal.
        height, err := c.reader.HeightForBlock(seal.BlockID)
        if err != nil {
            panic(err) // FIXME
        }

        // Retrieve the state commitment indexed for that block.
        commit, err := c.reader.Commit(height)
        if err != nil {
            panic(err) // FIXME
        }

        // If it does not match with the sealed state commitment, return a mismatch error.
        if seal.FinalState != commit {
            return fmt.Errorf("state mismatch for block %x at height %d", seal.BlockID, height)
        }
    }

    return nil
}

This component would be a dependency of the consensus tracker, which would be able to call it upon a block being finalized. Now the issue I have with this is that currently when a block is finalized, the consensus tracker just updates its latest height and does not retrieve the record directly. The record is only retrieved from the exec tracker once we want to index things.

Therefore, is it fine for you if we also grab the record in the OnBlockFinalized callback, and do the sanity check there?

It would look something like this:

// OnBlockFinalized is a callback that notifies the consensus tracker of a new
// finalized block.
func (c *Consensus) OnBlockFinalized(blockID flow.Identifier) {

    var header flow.Header
    err := c.db.View(operation.RetrieveHeader(blockID, &header))
    if err != nil {
        c.log.Error().Err(err).Hex("block", blockID[:]).Msg("could not get header")
        return
    }

    record, err := c.hold.Record(blockID)
    if err != nil {
        c.log.Error().Err(err).Hex("block", blockID[:]).Msg("could not retrieve record")
        return
    }

    err = c.sanity.Check(record)
    if err != nil {
        c.log.Fatal().Err(err).Hex("block", blockID[:]).Msg("execution state mismatch")
    }

    c.last = header.Height

    c.log.Debug().Hex("block", blockID[:]).Uint64("height", header.Height).Msg("block finalization processed")
}

The issue I have with this is I'm not sure how we should deal with the case where we fail to retrieve elements from the index.

awfm9 commented 3 years ago

@awfm9 Is it fine for you if I create a component specifically for this sanity check? It would be a very simple and small one that has access to the index and has a Check method. I think it's the cleanest way to do this, as I don't think this sanity check fits with the trackers as they are (since they should not read the index IMO) and it would also clutter the mapper.

This looks overcomplicated to me. I don't know what you are doing with the record there? All you need to do is:

I think it's OK to do it in a new component, and register the component separately as a block finalized consumer. When retrieval of any of these fails, or the commits don't match, we should use log.Fatal.

awfm9 commented 3 years ago

There is an issue with implementing this; we can't really check what we have indexed until the Badger transaction writing it is committed. If we start committing every block, we lose a lot of performance when indexing, especially for past sporks. We will have to rethink all of this a bit.