ripple / validator-history-service

Service for ingesting, aggregating, storing, and disbursing XRP Ledger validation related data.
ISC License

fix: update chain when a network resets #129

Closed pdp2121 closed 11 months ago

pdp2121 commented 11 months ago

High Level Overview of Change

When a network resets, there is no major issue with the existing APIs. The one exception is that the amendments for that network have to be updated manually, which will be handled in a separate ticket after the amendments PR has been merged.

One minor issue with the existing process arises if validations come in after the reset but before the old chain has been purged. Purging runs every hour and only removes a chain that has not been updated for an hour, so this window can occur. In that case, new ledgers are not added to the chain used for agreement calculation, so some agreement scores are missed until the old chain is purged and a new chain is added. Until then, agreement scores may still be calculated from pre-reset data only. (Existing code for reference:

    if (chainWithThisValidator !== undefined) {
      const skipped = ledger.ledger_index - chainWithThisValidator.current
      log.warn(`Possibly skipped ${skipped} ledgers`)
      // All new validations after a reset fail this condition and are not added
      // until the chain is purged due to inactivity.
      if (skipped > 1 && skipped < 20) {
        chainWithThisValidator.incomplete = true
        addLedgerToChain(ledger, chainWithThisValidator)
      }
    }

    if (chainWithThisValidator !== undefined || chainWithLedger !== undefined) {
      return
    }

)
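To make the failure mode concrete, here is a minimal sketch of the gap check in isolation. The `Ledger` and `Chain` shapes and the `wouldExtendChain` helper are hypothetical stand-ins rather than the service's actual types, and the sketch assumes that ledger indexes restart at low values after a reset.

```typescript
// Hypothetical minimal shapes; the real service's types may differ.
interface Ledger {
  ledger_index: number
}

interface Chain {
  current: number
  incomplete: boolean
}

// Mirrors the quoted condition: only a forward gap of 2 to 19 ledgers is accepted.
function wouldExtendChain(ledger: Ledger, chain: Chain): boolean {
  const skipped = ledger.ledger_index - chain.current
  return skipped > 1 && skipped < 20
}

// A chain left over from before the reset (assumed index, for illustration only).
const staleChain: Chain = { current: 5000000, incomplete: false }

// Small forward gap on the old network: accepted.
const beforeReset = wouldExtendChain({ ledger_index: 5000003 }, staleChain)

// After a reset the index restarts low, so `skipped` is a large negative number: rejected.
const afterReset = wouldExtendChain({ ledger_index: 12 }, staleChain)

console.log(beforeReset, afterReset) // prints: true false
```

Under these assumptions every post-reset ledger is rejected until the stale chain is purged, which matches the behavior described above.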

The fix for this would be:

Type of Change

Test Plan

Tests passed. The APIs ran normally on staging for a few days.

ckniffen commented 11 months ago

One thing I just realized is that agreement does not seem to compare the hash a validator gives for a ledger with the validated ledger hash. Is that logic somewhere else? If it is not, then we are not tracking agreement but only that a validator is sending out validations.
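As a rough illustration of the distinction, a hash-aware check might look like the sketch below. The `Validation` shape, the `validatedHashes` map, and `agreesWithNetwork` are all hypothetical and are not taken from the service's code.

```typescript
// Hypothetical shape, not the service's actual type.
interface Validation {
  signing_key: string
  ledger_index: number
  ledger_hash: string
}

// Counts as agreement only when the validator signed the hash that was
// validated at that index. An unknown index yields undefined and so returns false.
function agreesWithNetwork(
  validation: Validation,
  validatedHashes: Map<number, string>,
): boolean {
  return validatedHashes.get(validation.ledger_index) === validation.ledger_hash
}

// Made-up index-to-hash data for illustration.
const validatedHashes = new Map<number, string>([
  [100, 'ABC'],
  [101, 'DEF'],
])

const matching: Validation = { signing_key: 'n9Example', ledger_index: 100, ledger_hash: 'ABC' }
const diverging: Validation = { signing_key: 'n9Example', ledger_index: 101, ledger_hash: 'XYZ' }

console.log(agreesWithNetwork(matching, validatedHashes)) // prints: true
console.log(agreesWithNetwork(diverging, validatedHashes)) // prints: false
```

Counting only the presence of a validation would treat both of these as agreement, whereas the hash comparison separates them.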

pdp2121 commented 11 months ago

We decided that losing an hour or two of data is a small issue. My fix also assumes that all validators function normally: if a validator is far behind on its validations, or a misconfigured validator validates on more than one network, my fix would break and the chain would be reset, which would be undesirable. We will disregard this change and move forward as is.