Open dlongley opened 6 years ago
This is what was stated in the ledger-node issue:
This may need to be filed under specific consensus algorithms instead (or as well)... but essentially, if operations are ignored due to validation errors, those errors must have a characteristic such that they will always happen regardless of external factors such as database or network access failures. Otherwise an inconsistent state could be introduced.
I'm not sure what action should be taken here. @dlongley can you make specific recommendations?
This is information related to operation validation on the input side of things. The records returned from the `ledgerNode.records.get` API are built only from operations that have achieved consensus and are correct. Invalid operations are not 'ignored' due to validation errors; they are never allowed to enter the network in the first place. A malicious or improperly implemented node cannot propagate invalid operations via gossip because validation occurs during the gossip process.
Research indicates that transient validation errors caused by low-level storage failures may result in a new operation or gossip being rejected, but these failures do not leave the node in any permanently inconsistent state.
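To make the distinction concrete, here is a minimal sketch of keeping deterministic validation verdicts separate from transient failures. This is not the actual validator code; `TransientError`, `validateOperation`, `lookupRecord`, and the `recordId` field are all hypothetical names used only for illustration.

```javascript
// Hypothetical marker for failures unrelated to the operation's content.
class TransientError extends Error {
  constructor(message) {
    super(message);
    this.name = 'TransientError';
  }
}

// Hypothetical validator: the rule check is deterministic, the lookup is not.
async function validateOperation({operation, lookupRecord}) {
  // Deterministic: depends only on the operation itself, so every node
  // reaches the same verdict.
  if(typeof operation.recordId !== 'string') {
    return {valid: false, error: new Error('"recordId" must be a string.')};
  }
  try {
    // Assumed external dependency (database or network access).
    await lookupRecord(operation.recordId);
  } catch(e) {
    // A storage or network failure says nothing about the operation,
    // so it is surfaced as retryable instead of as a verdict.
    throw new TransientError(`Lookup failed: ${e.message}`);
  }
  return {valid: true};
}
```

With this split, a caller can retry on `TransientError` and treat `{valid: false}` as a permanent rejection.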
Operations arriving via HTTP on a ledgerAgent are immediately sent to the `ledgerNode.operations.add` API:
https://github.com/digitalbazaar/bedrock-ledger-agent/blob/master/lib/http.js#L290
The `ledgerNode.operations.add` API passes the operation immediately through the validator, and an error is thrown if `valid === false`. If the error is transient in nature, the user may retry submitting their operation at a later time.
https://github.com/digitalbazaar/bedrock-ledger-node/blob/b5858a97d16de9ce082fd70bdb7c111cb7842c1e/lib/LedgerNodeOperations.js#L39-L44
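As a rough sketch of the flow described above (not the actual bedrock-ledger-node code): `ValidationError`, `validate`, `addOperation`, the `type` check, and the in-memory `storage` array are hypothetical stand-ins.

```javascript
// Hypothetical error type: marks a rejection raised by the add path.
class ValidationError extends Error {
  constructor(message, {cause} = {}) {
    super(message);
    this.name = 'ValidationError';
    this.cause = cause;
  }
}

// Hypothetical validator: resolves to {valid, error} instead of throwing.
async function validate(operation) {
  if(typeof operation !== 'object' || operation === null) {
    return {valid: false, error: new Error('Operation must be an object.')};
  }
  if(typeof operation.type !== 'string') {
    return {valid: false, error: new Error('Operation "type" is required.')};
  }
  return {valid: true};
}

// Sketch of the add path: validate first, store only if valid.
async function addOperation({operation, storage}) {
  const result = await validate(operation);
  if(result.valid === false) {
    // The caller sees the failure and nothing is written.
    throw new ValidationError('Operation rejected.', {cause: result.error});
  }
  storage.push(operation);
  return {accepted: true};
}
```

Because the throw happens before any write, a rejected operation leaves no trace in storage, and the submitter is free to try again later.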
During Continuity gossip, operations travel through the `addBatch` API:
https://github.com/digitalbazaar/bedrock-ledger-consensus-continuity/blob/d37411b5d6254c7492af575812e8a62e8ab6ac16/lib/agents/gossip-agent.js#L76-L77
In the `addBatch` API, the ledgerNode `validate` API may return `{valid: false, error}` for a variety of reasons. The error could be a proper validation error or an error captured from another low-level API. Validator code is not designed to throw, but that may occur in some rare instances. Whatever the case, if `valid === false`, the `addBatch` API throws here:
https://github.com/digitalbazaar/bedrock-ledger-consensus-continuity/blob/d37411b5d6254c7492af575812e8a62e8ab6ac16/lib/events.js#L143-L144
An error thrown in the `addBatch` API is caught here, which results in the termination of the gossip session:
https://github.com/digitalbazaar/bedrock-ledger-consensus-continuity/blob/d37411b5d6254c7492af575812e8a62e8ab6ac16/lib/agents/gossip-agent.js#L109-L114
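The throw-and-catch relationship between the batch API and the gossip session can be sketched roughly as follows. This is a simplified illustration, not the Continuity implementation; the function shapes, the `peer` and `events` parameters, and the returned result object are assumptions.

```javascript
// Hypothetical stand-in for the batch API: throws on the first invalid event.
async function addBatch({events, validate}) {
  const accepted = [];
  for(const event of events) {
    const {valid, error} = await validate(event);
    if(valid === false) {
      // Either a real validation error or a captured low-level error.
      throw error || new Error('Event failed validation.');
    }
    accepted.push(event);
  }
  return accepted;
}

// Hypothetical gossip session: any error from addBatch ends the session.
async function gossipWith({peer, events, validate}) {
  try {
    const accepted = await addBatch({events, validate});
    return {peer, success: true, accepted: accepted.length};
  } catch(error) {
    // Nothing from this session is kept; the caller decides when to retry.
    return {peer, success: false, error};
  }
}
```

In this sketch a single invalid event causes the whole session to report failure, which mirrors the behavior described above where the session is terminated.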
A gossip session that terminates due to an error like this results in the gossip peer being backed off; the gossip operation will be retried later at increasing intervals. If the gossip was initially rejected due to some transient failure on the local node, it should succeed during some later attempt, possibly after intervention by a SysAdmin.
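For illustration, retrying a peer at increasing intervals could look something like the sketch below. The exponential schedule, the `PeerBackoff` class, and the default base delay, factor, and cap are all assumptions; the actual backoff policy used by the consensus code may differ.

```javascript
// Hypothetical per-peer backoff tracker; all numeric defaults are illustrative.
class PeerBackoff {
  constructor({baseMs = 1000, factor = 2, maxMs = 60000} = {}) {
    this.baseMs = baseMs;
    this.factor = factor;
    this.maxMs = maxMs;
    // Maps a peer ID to its count of consecutive failed sessions.
    this.failures = new Map();
  }

  // Record a failed gossip session and return the next wait in milliseconds.
  recordFailure(peer) {
    const count = (this.failures.get(peer) || 0) + 1;
    this.failures.set(peer, count);
    return this.delayFor(peer);
  }

  // A successful session clears the peer's backoff.
  recordSuccess(peer) {
    this.failures.delete(peer);
  }

  // Delay grows by `factor` per consecutive failure, capped at `maxMs`.
  delayFor(peer) {
    const count = this.failures.get(peer) || 0;
    if(count === 0) {
      return 0;
    }
    return Math.min(this.baseMs * this.factor ** (count - 1), this.maxMs);
  }
}
```

Once the underlying failure is resolved, the next successful session resets the peer so gossip resumes at the normal pace.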
See: https://github.com/digitalbazaar/bedrock-ledger-node/issues/25