Current element contract has no sybil attack protection

Therecanbeonlyone1969 commented 4 years ago

Proposal: To avoid Sybil attacks on the anchoring contract(s) in a permissionless setting, there needs to be an open registration function for Operators (account address, public key, ether stake, possibly commitment hash) to limit the number of writers to a contract. The stake slashing condition serves as a rate limit per Operator and needs to be chosen such that each Operator can only submit a limited number of transactions per contract per block.

Preliminaries for a staking protocol which are important from a Game Theory point of view:

The stake does not provide any monetary benefit for an Operator. This is important to avoid economic collusion scenarios.
Operators can only monetize their service by charging requesters a fee, either per transaction or as a recurring flat fee.

Since Operators are independent of one another, and each has its own income stream, the competition for business is expected to keep transaction fees low. Of course, oligopolies could form. However, the fact that there is no barrier to entry for an Operator, except a stake, should keep the market sufficiently diverse. A small caveat is that if the protocol were to become compute-heavy, it could discourage broad participation as the cost of computing might be high.

The slashing conditions for the protocol are as follows:

If an Operator submits more than 1 transaction per block, the Operator’s stake is slashed and the public key of the Operator will be blacklisted. This is required not only as a rate limiter to avoid spamming but also to foster Operator diversity.
If an Operator submits a transaction that has no verification data in the CAS network and the lack of data is detected by another Operator and is reported by that Operator, the offending Operator’s entire stake will be slashed and the public key of the Operator will be blacklisted.

Any registered Operator can submit a “whistleblower” transaction to the registration contract that proves that the accused Operator submitted a transaction without associated verification data. If the submitting Operator is registered and its signature is validated, the contract creates an event signaling a malicious action and starts a challenge period of a certain number of blocks, say 100 or 1000. A “whistleblower” transaction has the following elements:

Account of Whistleblower Operator
Public key of Whistleblower Operator
Account of Malicious Operator
Public key of Malicious Operator
Malicious Entry: Roothash, URI, Generalized Timestamp
Proof Material: Data object of the query that shows NULL returned data together with a generalized time stamp of that query from the CAS network
Digital Signature of the Whistleblower Operator
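The element list above could be modeled roughly as follows. This is only a sketch: the field names, the use of integers for generalized timestamps, and the `is_well_formed` helper are assumptions made for illustration, and signature verification is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MaliciousEntry:
    root_hash: str
    uri: str
    generalized_timestamp: int  # assumption: timestamps are comparable integers


@dataclass(frozen=True)
class ProofMaterial:
    query_result: object  # None models the NULL returned by the CAS query
    query_timestamp: int  # generalized timestamp of that query


@dataclass(frozen=True)
class WhistleblowerTx:
    whistleblower_account: str
    whistleblower_public_key: str
    accused_account: str
    accused_public_key: str
    entry: MaliciousEntry
    proof: ProofMaterial
    signature: str  # placeholder; real verification is out of scope here


def is_well_formed(tx: WhistleblowerTx) -> bool:
    """Structural checks only (hypothetical helper, not a signature check)."""
    return (
        tx.whistleblower_account != tx.accused_account
        and tx.proof.query_result is None  # the CAS query returned no data
        # assumption: the query must have happened after the anchored entry
        and tx.proof.query_timestamp > tx.entry.generalized_timestamp
        and bool(tx.signature)
    )
```

A counter proof or validator vote could reuse the same structure, since the proposal says they share the whistleblower transaction's format.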

To prevent malicious accusations against other Operators, there is a challenge period during which the accused Operator or any other Operator can submit a counter proof with the same structure as the whistleblower transaction. The contract then randomly selects N registered Operators, excluding the accused and the whistleblower, as validators. It creates an event to notify those Operators that they have been selected and starts the validation period of, for example, another 100 or 1000 blocks. The selected validators then have until the end of the validation period to either agree or disagree with the challenge by submitting proof data in the same format as above. The contract counts the votes, ensuring that the data content (hash of the data content) matches either the whistleblower or the accused submission and that the generalized timestamps are logically later than the original submissions. If M-of-N of the validators agree with the whistleblower, the accused Operator’s stake is slashed and the Operator is blacklisted. M and N can be chosen so that the condition is, for example, 2-of-3 or 2-of-5. If fewer than M validators agree with the whistleblower, then the whistleblower’s stake is slashed and its public key blacklisted.
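A minimal sketch of the validator selection and M-of-N tally described above, under a few assumptions: the function names are hypothetical, the random source is passed in as a stand-in (a real contract would need an on-chain randomness source, which is not addressed here), and a validator who does not vote is counted as not agreeing with the whistleblower.

```python
import random


def select_validators(registered, accused, whistleblower, n, rng):
    """Pick n validators at random, excluding the two parties to the dispute."""
    eligible = sorted(op for op in registered if op not in (accused, whistleblower))
    if len(eligible) < n:
        raise ValueError("not enough eligible Operators to form a validator set")
    return rng.sample(eligible, n)


def resolve_challenge(votes, validators, m):
    """Return which party is slashed: 'accused' or 'whistleblower'.

    votes maps a validator to True if it agrees with the whistleblower.
    Votes from accounts outside the selected validator set are ignored.
    """
    agree = sum(1 for v in validators if votes.get(v) is True)
    return "accused" if agree >= m else "whistleblower"
```

For a 2-of-3 condition, two agreeing validators slash the accused Operator, while a single agreeing validator means the whistleblower is slashed.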

The staking protocol could work as follows:

An Operator submits a registration transaction to the anchoring contract. The required permissioning contract validates that neither the account nor the public key has already been registered or blacklisted. If either has, the registration is rejected. The contract then validates the signature, records the stake, and generates a success message as an event.
Every time an Operator submits a transaction, the permissioning contract checks if the block number of this transaction is the same as the block number of the last recorded transaction from the Operator. If the block numbers match, the slashing condition is enforced and the stake of the Operator is "burned" by sending the Ether used as a stake to the zero address. The Operator's account is moved to the blacklist and removed from the registry of active Operators. If the block numbers do not match, the entry for the Operator is updated with the new block number in order to keep track of the Operator's submissions.
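The registration and per-block slashing steps above can be sketched as an in-memory model. This is not contract code: the class and method names are hypothetical, `min_stake` is an assumed parameter, signature validation and events are omitted, and "burning" is modeled as a counter rather than a transfer to the zero address.

```python
class OperatorRegistry:
    def __init__(self, min_stake):
        self.min_stake = min_stake  # assumption: a fixed minimum stake
        self.stakes = {}            # account -> staked amount
        self.public_keys = {}       # account -> public key
        self.last_block = {}        # account -> block number of last submission
        self.blacklist = set()      # blacklisted accounts and public keys
        self.burned = 0             # total stake burned by slashing

    def register(self, account, public_key, stake):
        """Reject blacklisted or already registered accounts and keys."""
        if account in self.blacklist or public_key in self.blacklist:
            return False
        if account in self.stakes or public_key in self.public_keys.values():
            return False
        if stake < self.min_stake:
            return False
        self.stakes[account] = stake
        self.public_keys[account] = public_key
        return True

    def submit(self, account, block_number):
        """Slash on a second submission in the same block, else record it."""
        if account not in self.stakes:
            return "rejected"
        if self.last_block.get(account) == block_number:
            self._slash(account)
            return "slashed"
        self.last_block[account] = block_number
        return "accepted"

    def _slash(self, account):
        # Burn the stake, then blacklist both the account and its public key.
        self.burned += self.stakes.pop(account)
        self.blacklist.update({account, self.public_keys.pop(account)})
        self.last_block.pop(account, None)
```

Storing only the last block number per Operator keeps the check to a single comparison per submission, which matches the bookkeeping described in the text.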

If a Whistleblower, Challenge or Validation transaction is submitted, the permissioning contract checks if the submitting Operator is registered, and if the digital signature is valid. If both checks are passed, the contract follows the Whistleblower process outlined above.

When an Operator wants to unbond/unregister, the permissioning contract unregisters the Operator's address. It then waits until the block containing the unbonding transaction and at least two subsequent blocks have been finalized, and then sends the stake back to the account of the original Operator registration.
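A rough model of the unbonding delay, assuming the contract can learn the latest finalized block number (how it would learn this is not specified in the proposal). The names `Unbonding`, `request_unbond`, `withdraw`, and `FINALITY_DEPTH` are made up for this sketch.

```python
FINALITY_DEPTH = 2  # "at least two subsequent blocks" after the unbonding block


class Unbonding:
    def __init__(self):
        self.stakes = {}   # account -> staked amount of registered Operators
        self.pending = {}  # account -> (stake, block of the unbonding transaction)

    def request_unbond(self, account, block_number):
        """Unregister immediately, but hold the stake until finality."""
        stake = self.stakes.pop(account)
        self.pending[account] = (stake, block_number)

    def withdraw(self, account, finalized_block):
        """Return the stake once enough blocks are finalized, else 0."""
        stake, unbond_block = self.pending[account]
        if finalized_block < unbond_block + FINALITY_DEPTH:
            return 0
        del self.pending[account]
        return stake
```

Unregistering first and paying out later means the Operator cannot submit further transactions while the stake is still held.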

One of the key game-theoretic insights behind the staking mechanism is that there is no economic gain in trying to game the protocol, since the only economic gain for an Operator is from the fees collected from customers such as Alice and Bob.

OR13 commented 4 years ago

A Sybil attack is when a single adversary manifests as many identities to obstruct a protocol, or corner resources for themselves... before we consider how to stop this in ethereum, we must address the sidetree core protocol: https://identity.foundation/sidetree/spec/#proof-of-fee

We should also make a distinction between a "node operator" someone who has a hot wallet and anchors data to the contract, and a "did controller" someone who signs transactions and passes them to a node operator.

There is no defense against sybil attacks at the did controller level, because there is 0 cost and no correlation factors for each new identity.... a node operator throttling inbound requests is the closest we can get.

At the node operator level, each node operator has a hot wallet, and they use it to pay transaction fees to have their message stored on a blockchain....

How do you know that 2 node operators are not the same person?

It's not possible to know.

Given that it's not possible to know if 2 node operators are the same person, what value does proof of fee provide in sidetree core?

If Microsoft or some other wealthy corporation wants to anchor really really large batches, they can stake funds (in Bitcoin), and then honest sidetree nodes can use that information to allow for larger file sizes...

in other words... proof of fee has nothing to do with sybil, it's a pay to play at scale scheme.

So what's the worst case scenario attack on sidetree?

it's a massive sybil army NOT using "proof of fee", filling up the ledger with pure creates.

We can't use some fee scheme to slow them down, because they never anchor anything large enough to hit the proof of fee threshold (so they are never forced to "pay extra" to attack), and we can't distinguish them, because every anchor call they make is from a brand new address.

So now the problem has reduced to: how to trust a brand new address, when you can't see the size of the thing it's anchoring in the contract.

There are a couple approaches here...

Approach 1

We could make the contract take a massive fee, making it so expensive to get your batch anchored that only honest wealthy people could afford to anchor DIDs for others... (feudalism ftw!)... I'm joking, but this is actually the first approach that should be implemented IMO, because while it's incredibly biased in favor of wealthy parties... it's also simple to evaluate.

Approach 2

Implement an interactive verification game... ala Truebit. Where node operators advertise jobs, and then a massively complicated interactive game takes place, whereby multiple parties are selected at random, Proof of Storage / Time is solved, and then protocol conformance is established, and a label is applied to the anchor job. Passing means stake is returned (less the nontrivial cost of orchestration), failing means stake is lost (challengers get rewarded, but carefully, to prevent self attack bounties, etc...)... having worked with the brilliant Truebit folks in the past.... I fear this approach, but I'm somewhat familiar with it.

Approach 3

Take some state channel software off the shelf and bet the farm on it. After all, we just need some number of parties to agree that some number of parties played by the rules (that they paid for what they anchored).

Find which ethereum state channel solution has protected the largest amount of funds for the longest amount of time.... and just copy that approach.... replacing the current "on chain contract state" with an off chain state channel... another layer of virtualization, but with some portability over the approach taken by the ION team with bitcoin.

We've known about approaches 1, 2, 3 for a long time... and to be honest... we have always hated all of them.... but of all of them, 3 is the most desirable from an architecture perspective for element. It creates a layer of indirection between "ethereum" and "sidetree"... allows us to have more portability by leveraging something generic, and it's general purpose so it either reinforces that state channels are a useful thing, or it shows that they aren't by demonstrating that they don't work / can't solve a simple authenticated CRDT protocol problem.... similar to how having p-256 defend billions of dollars of value could have helped convince us that it's not back-doored by NSA (thanks satoshi).

So let's tackle this issue not from the perspective of "oh joy let's write another smart contract", but from the perspective of "which ethereum backed company solved this problem", and "let's use that instead".

It's been a while since I was at DevCon, what's the state of the art on this front these days?

Therecanbeonlyone1969 commented 4 years ago

I generally agree with you. Summary of state of the art is here -- https://medium.com/matter-labs/optimistic-vs-zk-rollup-deep-dive-ea141e71e075

Since we do not have to worry about digital assets, things get simpler. BTW, I am not utilizing something that I cooked up and that is not proven. The above, albeit simplified, validator approach to state commits is well established for e.g. Plasma or Optimistic rollup operators during a challenge period.

And the important point to note here is that there is a rule that a single operator can only submit a certain number of transactions per block, e.g. 1. And one can even go one step further by saying that the contract allows only 1 transaction per block based on the block number (you simply store the block number with the last transaction number and operator address). That way you can always prevent a large number of hashes from being anchored, while the number of DID ops in each anchor file might be large (up to the node operator to manage). An attacker would then need a front-running attack at the client level, aka collusion with a miner. The interesting side effect you have when you limit the number of transactions per block for the contract is that batch processing times, and thus batch sizes, will be limited to the block rate (a little less actually), since it now becomes a race with other operators to get your batch transaction in before any of the other operators if you e.g. allow only 1 contract transaction per block.
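The contract-wide limit could look roughly like this. It is a sketch under assumptions: the names are invented, and a second transaction in the same block is simply rejected (as a revert would do) rather than slashed, which is one possible reading of the rule.

```python
class AnchorContract:
    def __init__(self):
        self.last_anchor_block = None  # block number of the last accepted anchor
        self.anchors = []              # (block_number, operator, anchor_hash)

    def anchor(self, operator, anchor_hash, block_number):
        """Accept at most one anchor per block across all operators."""
        if self.last_anchor_block == block_number:
            return False  # someone already anchored in this block
        self.last_anchor_block = block_number
        self.anchors.append((block_number, operator, anchor_hash))
        return True
```

With this rule, two operators targeting the same block race for a single slot, which is the batch-timing side effect mentioned above.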

And yes, let's start with a simple, relatively large stake and simple slashing/rate-limiting conditions.

OR13 commented 4 years ago

@Therecanbeonlyone1969 awesome, we'll totally take a set of truffle contract tests for an improvement when / if you have time. Let's work our way towards the state of the art.
