staheri14 commented 3 years ago

Problem

In the waku-rln-relay, group member keys are represented as a Merkle tree. Users need the tree root and their auth paths for proof generation and verification. In our current implementation, we keep the set of user's keys (more precisely the id commitment keys) in the state of a smart contract. However, we do not store the tree root on the contract since it would cause higher gas costs for the member insertion and deletion. This design choice (no root on the smart contract) is in contrast to the existing applications that utilize Merkle tree for a similar purpose e.g., Tornado Cash. They typically store the tree root on the contract. As such, there is this concern that the lack of the Merkle tree root may cause some future issues which we are not aware of at the moment. Additionally, this custom design may also affect the solution generality e.g., the rln idea can be similarly used to enable other applications like private reputation. However, a custom design may prevent us from utilizing the existing rln-relay development for other use-cases.

The goal of this issue is to investigate and identify the trade-offs between our current design (not having tree root on the contract state) and other existing solutions.

Possible trade-offs

Below, I have sketched some of the aspects that would be directly impacted by the amount of information (about the Merkle tree) stored on the state of the smart contract.

Gas consumption
User storage overhead
User bandwidth overhead
User computation overhead
Security: does the design enable access to the most recent and correct Merkle tree root (the correctness of auth path can be verified by having the correct root) or is it prone to fraud
Operations that can be supported i.e., member deletion and insertion: Some solutions do not support member deletion
The computation and bandwidth costs associated with bootstrapping a new member: How much resources a new member should spend to get access to/construct the tree root and its auth path.

What we know

Below is the initial analysis of the current design of waku-rln-relay (which we know so far). There might be more efficient approaches that we shall investigate as part of the present issue.

Method A: No root on the chain:

This is the solution deployed in waku-rln-relay:

State of the contract:
- A map of users identity commitment keys
- No root
The root and auth paths are not available publicly.
For insertion, the new user sends its key to the contract and also locks some funds in the contract. The smart contract stores it in the list of members.
For deletion, the user sends the secret key sk of the deleted member to the contract. The inclusion of the sk is checked by verifying whether H(sk) belongs to the membership list. If so, then the corresponding value is replaced with a null value. The fund of the slashed member is also rewarded to the deleting user.
Each group member stores the entire Merkle tree locally. It results in the storage of 2*N-1hash values where N=2^d is the number of tree leaves (maximum number of members) in a tree with depth d. See this issue for more info.
Existing members listen to the state of contract for the new events e.g., member insertion and deletion. They update their local tree accordingly.
This solution does not have any security issues. Each user has a copy of the tree so can be sure its root and auth path are valid and recent (as long as the state of the smart contract is accessible).
Bootstrapping: every new joining node needs to fetch all the transactions sends to the smart contract (inserted and deleted keys) to be able to construct the entire tree from scratch i.e., to calculate its auth path and the root. It imposes the download of 2*N member keys (tree leaves) and conducting N hash operations in the worst case.

The following table summarizes the result of our analysis for Method A. The gas consumption results are borrowed from https://hackmd.io/JoxnlDq3RT6WhtA-KBxtYg?both#A-Pubkey-map. Parameters in the table should be interpreted as below:

d is the tree depth
H is the size of the hash output
N is the number of Merkle tree leaves which is 2^d
The gas consumption is for d=32
A Batch consists of B=128 keys.

Method	Gas cost for registration	Gas cost for deletion	Storage per user	User computation per update	Bandwidth cost per update	Security	Supported operations	Bootstrapping computation cost	Bootstrapping Bandwidth
A: Offchain root	Batch: 20k, Single: 40k	Batch: 20k, Single: 40k	full tree: `d`=32 274 GB, `d`=20 67 MB partial tree: `d`=32 2.048 KB, `d`=20 0.128 KB	O(log(`d`)) to recalc root and auth path	`H`	None	Insertion and Deletion	`N` hashing	O(`N`)

What we do not know

There are a few other solutions (including some L2 solutions) in the following document provided by the rln team. However, the doc only provides the gas consumption of each solution and does not cover other aspects we listed in this issue. The gas consumption analysis in the doc is also limited to the tree depth 32 (though d=20 is a more reasonable value). I am going to work on those solutions and provide a similar analysis (as waku-rln-relay) to have a more clear picture of their performance and trade-offs.
There are various L2 solutions out there that may result in better performance or security. This is also something to spend time on.
The use of the Verkle tree and other proof systems is another research path to consider (suggested in https://github.com/vacp2p/research/issues/57#issuecomment-793552712).

barryWhiteHat commented 3 years ago

For deletion, the user sends the secret key sk of the deleted member to the contract. The inclusion of the sk is checked by verifying whether H(sk) belongs to the membership list. If so, then the corresponding value is replaced with a null value.

Seems to be a tragedy of the commons issue here. what is the incentive for me to slash a user who is spamming the network ?

I guess my preferred solution is to use the p2p network for slashing and pass around the user private key + merkle proof for the updated tree. So slashing happen in p2p network.

On chain root does seem important:

If you want to make a on chain root we can do this using a ZKP to confirm every slashing and calculate the new merkle root. This would be much cheaper because you would have many slashing in a single tx. Thsi requires someone to store the whole merkle tree. I am thinking that every user does not need this maybe just power user.

staheri14 commented 3 years ago

Thanks a lot @barryWhiteHat for the comments!

Seems to be a tragedy of the commons issue here. what is the incentive for me to slash a user who is spamming the network ?

My bad! you are right. I forgot to mention that there is a fund locked in the contract which is rewarded to the slashing user.

I guess my preferred solution is to use the p2p network for slashing and pass around the user private key + merkle proof for the updated tree. So slashing happen in p2p network.

By slashing, do you mean just deleting the member or taking his fund as well? if it involves taking the fund, how would that happen in a p2p fashion?

In the case of using a p2p network to propagate the deletion event, what will happen to the smart contract state? do not we need to delete the corresponding key from the contract?

staheri14 commented 3 years ago

If you want to make a on chain root we can do this using a ZKP to confirm every slashing and calculate the new merkle root.

Are you suggesting an optimistic version or performing the zkp verification and root recalculation on-chain? @barryWhiteHat

Another question here is that, what can possibly go wrong, especially security-wise, if no root gets stored on the chain. Given that all the nodes have a local copy of the tree (they can construct the root and their auth path locally), then there should not be any security issue even if we do not store the root on the chain. Would you please let me know your thoughts about this? Thanks! @barryWhiteHat

This would be much cheaper because you would have many slashing in a single tx. Thsi requires someone to store the whole merkle tree.

Does this also mean that we can not kick out spammers right away? their keys can live longer, right?

I am thinking that every user does not need this maybe just power user.

Indeed! not every user has to store the tree, we can have full and light nodes.

Also, there is one more thing that I am not sure about is that if NO one in the system keeps the entire tree, then how can one recalculate the tree root when the auth path of a slashed user is not available? My understanding is that in that case, we need to reconstruct the tree to calculate the slashed user's auth path, is there other ways around this?

barryWhiteHat commented 3 years ago

By slashing, do you mean just deleting the member or taking his fund as well? if it involves taking the fund, how would that happen in a p2p fashion?

No taking funds just delete user.

In the case of using a p2p network to propagate the deletion event, what will happen to the smart contract state? do not we need to delete the corresponding key from the contract?

Nah smart contract does not need to know about this.

Are you suggesting an optimistic version or performing the zkp verification and root recalculation on-chain? @barryWhiteHat

Not optimistic its more like proof of validity rather than fraud proof.

Another question here is that, what can possibly go wrong, especially security-wise, if no root gets stored on the chain. Given that all the nodes have a local copy of the tree (they can construct the root and their auth path locally), then there should not be any security issue even if we do not store the root on the chain. Would you please let me know your thoughts about this? Thanks! @barryWhiteHat

I think that you can remove the requirement for users to hold the tree. And instead have a slashing include a merkle proof of the leave that was slashed and another with that leaf replaced with 0. I think this merkle proof should be enough for the observers to recalculate their auth path so they can continue to make proofs. The new security requirement here is that particion attacks can't happen where i somehow make 1/2 the network use root_1 and the other half use root_2 so that they can't talk to eachother. I think there are some ways to overcome this. Bascially have root1 users listen to root2 message and remove anyone that has been slashed. This should lead to getting to consensus on the calonical root.

Does this also mean that we can not kick out spammers right away? their keys can live longer, right?

Yeah. I would like a hybrid here of p2p slashing and later smart contract slashing to catch up.

Also, there is one more thing that I am not sure about is that if NO one in the system keeps the entire tree, then how can one recalculate the tree root when the auth path of a slashed user is not available? My understanding is that in that case, we need to reconstruct the tree to calculate the slashed user's auth path, is there other ways around this?

I would require a slashing message to include the auth path as well as the private key.

staheri14 commented 3 years ago

I would require a slashing message to include the auth path as well as the private key.

@barryWhiteHat But what if it does not? The slashing user can only recover the sk of the spammer but not the auth path (that has to be calculated). If the spammer refuses to provide the auth path, then someone has to calculate it, right?

By slashing, do you mean just deleting the member or taking his fund as well? if it involves taking the fund, how would that happen in a p2p fashion?

No taking funds just delete user.

Without funds, there will be no reward for users who catch spammers, not that it is a big issue, but wondering how to add the incentive part for slashing.

I think that you can remove the requirement for users to hold the tree. And instead have a slashing include a merkle proof of the leave that was slashed and another with that leaf replaced with 0. I think this merkle proof should be enough for the observers to recalculate their auth path so they can continue to make proofs.

In this proposal, does the contract hold the list of users' public keys (identity commitments) and the root, or only the root?

Does this also mean that we can not kick out spammers right away? their keys can live longer, right?

Yeah. I would like a hybrid here of p2p slashing and later smart contract slashing to catch up.

Wondering why not altering the state of the contract for the slashed key, I mean deleting and replacing the key directly on the state of the contract (assuming the contract holds the list of pks)? are we avoiding deletion to reduce the gas consumption? If we delete directly then we can make sure the spammer won't be able to message again

barryWhiteHat commented 3 years ago

But what if it does not? The slashing user can only recover the sk of the spammer but not the auth path (that has to be calculated). If the spammer refuses to provide the auth path, then someone has to calculate it, right?

The user would get private key. Send to "full node" and get the auth path from there. Then the private key + auth path would get passed around. This is enough for every user to update their witness.

Without funds, there will be no reward for users who catch spammers, not that it is a big issue, but wondering how to add the incentive part for slashing.

Every sign up has 10 eth deposit. If you find someones private key you can withdraw 1/2 of the 10 eth deposit. We only allow them to withdraw 1/2 to avoid an attacker able to slash themselves very quickly to aovid someone else slashing them.

In this proposal, does the contract hold the list of users' public keys (identity commitments) and the root, or only the root?

It depends where the root comes from. If its from interrep it would just be the root and the witness would be gotten by asking the interrep server. If its a smart contract sign up the root and withness would be able to be calculated by the smart contract.

Wondering why not altering the state of the contract for the slashed key, I mean deleting and replacing the key directly on the state of the contract (assuming the contract holds the list of pks)? are we avoiding deletion to reduce the gas consumption? If we delete directly then we can make sure the spammer won't be able to message again

Remove gas cost
Remove delay around slashing
Remove concerns about how big deposit needs to be to incentivize slashing in all cases.

staheri14 commented 3 years ago

The user would get private key. Send to "full node" and get the auth path from there. Then the private key + auth path would get passed around. This is enough for every user to update their witness.

This means the full tree has to be stored somewhere in the network 👍 which is fine and what I was exactly looking for, but I previously asked and thought you are suggesting otherwise. My understanding is that regardless of having or not having the tree root on the state of the contract, the full tree has to be stored by a subset of participants so that the auth path of the slashed keys would be available to slashing peers.

Every sign up has 10 eth deposit. If you find someones private key you can withdraw 1/2 of the 10 eth deposit. We only allow them to withdraw 1/2 to avoid an attacker able to slash themselves very quickly to aovid someone else slashing them.

In your proposal, the fund withdrawal and key deletion won't be at the same time, one is in a p2p manner the other is through a smart contract, right?

Remove gas cost Remove delay around slashing Remove concerns about how big deposit needs to be to incentivize slashing in all cases.

True! 👍

barryWhiteHat commented 3 years ago

In your proposal, the fund withdrawal and key deletion won't be at the same time, one is in a p2p manner the other is through a smart contract, right?

Yes or not have in contract slashing.

I think my broader point is that having on chain slashing is very expensive and difficult so lets hav eoff chain slashing. The anti Sybil for signups can be handled in other ways. Having money be the anti sybil mech seems very annoying.

staheri14 commented 3 years ago

I had a call with Barry and below is the gist of that call: cc: @barryWhiteHat please feel free to edit if there is something missing :)

Keeping Merkle tree root on-chain is not necessary (valid only for the waku-rln-relay case)

Uploading the root to the contract (but not calculating it on the contract) will be problematic where there are two concurrent registrations i.e., two users attempt to insert their pks to the same tree at the same time and will end up with two different tree roots. This happens since the root calculation takes place off-chain where there is no consensus over the order of inserted public keys i.e., each user acts independently and unaware of the other users.

As such, having the tree root uploaded on the contract is not necessary and can even cause issues (one may ask why other applications maintain the Merkle root on the state of the contract, well my answer is that in those cases the transactions ordering and the root calculation is done by a single entity e.g., a block proposer in a blockchain, therefore no concurrency and inconsistency can happen).

The above argument brings us to the conclusion that we can safely remove the tree root from the contract. Instead, just the list of pks would be enough and users can reconstruct the tree root locally.

The full Merkle tree has to be available (either on-chain or off-chain)

The Merkle tree has to be stored somewhere in the system (on or off the chain) to make sure that the auth path of a slashed node is available. Without the auth path of a slashed key, users won't be able to calculate the tree root and update their own auth path.

Off-chain slashing is better that on-chain

On-chain slashing cons:

Subject to delay: the time for which nodes have to wait for the deletion transaction to get through and becomes part of the BC.
The unpredictability of the gas fee at the time of slashing: Slashing needs gas usage on the slasher side, hence the amount of reward paid to the slasher should be high enough to cover the deletion gas cost as well. However, due to the market fluctuation, it would be hard to predict the gas fee for some point in the future therefore the registration fee set today may not cover the slashing cost for the future.

Off-chain slashing pros:

Cost-efficient and simplified slashing model: In the off-chain case, there will be no interaction with the contract for member deletion, therefore we can save gas cost. The idea is that a slasher publishes a deleted pk and the sk over a p2p network. Users which listen to the network will delete the key from their local trees and construct the root. The punishment for the slashed user is that her messages will not be relayed anymore since the entire network has removed that key from their local trees. So, the slashed user has to re-register and repay for messaging.
No race: Another benefit of off-chain slashing is that it copes with the race issue which we would otherwise have to deal with in the on-chain slashing scenario (race can happen in the on-chain slashing if two slashers simultaneously claim the reward for the same key).
Real-time removal of sapmmers: since slashed keys are propagated through a p2p network, users update their tree immediately and won't accept any message from the spammer (needless to wait for the slash transaction to be mined)

oskarth commented 3 years ago

Good discussion! Sanaz and I've also been talking a bit to understand more clearly trade-offs involved here, should be done in a few days.

My main question based on conversation so far is: How would off-chain slashing work? It isn't obvious to me how you'd get the funds. From an incentive POV, it is quite important that a slasher is rewarded for their actions.

barryWhiteHat commented 3 years ago

My main question based on conversation so far is: How would off-chain slashing work? It isn't obvious to me how you'd get the funds. From an incentive POV, it is quite important that a slasher is rewarded for their actions.

If the cost is low enough the incentive for slashing is having to deal with less spam. The idea of offchain slashing is that its sooo cheap that you don't need to give a reward.

vacp2p / research

Performance and Security Analysis of Waku-RLN-Relay (vs other existing solutions) #91

Problem

Possible trade-offs

What we know

Method A: No root on the chain:

What we do not know

On chain root does seem important:

Keeping Merkle tree root on-chain is not necessary (valid only for the waku-rln-relay case)

The full Merkle tree has to be available (either on-chain or off-chain)

Off-chain slashing is better that on-chain

On-chain slashing cons:

Off-chain slashing pros: