barryWhiteHat / semaphore

GNU General Public License v3.0
110 stars 92 forks source link

Comments about generic singaling mechanisim here. #1

Open barryWhiteHat opened 6 years ago

HarryR commented 6 years ago

The generic signalling mechanism allows for the separation of groups and signals.

An Identity may be a member of one or more groups, each group is a Merkle tree. The Identity knows the secret preimage of the leaf of their entry in the merkle tree, their "private key".

A signal is a proof of membership of a specific merkle tree bound to some parameters which make the signature unique for that type of signal.

The signal has a GUID, the GUID is created from a hash of the JSON descriptor of its parameters (see README.md). Each signal is bound to one specific group, changing the group the signal is bound to means there is a new signal GUID etc.

Each member of every group can vote on each signal exactly once, however the signal may have uniqueness constraints, like a database primary key, which mean that more than one instance of a signal may be signed depending on whether or not the unique constraint is violated.

I am working on my fork to turn the initial proof of concept into a full implementation of the README.md.

I am just finishing up work on the initial PoC of https://github.com/HarryR/panautomata before I really dig into this.

omershlo commented 6 years ago

Great work. One point I want to understand better is in regard to the limitations you claim the system has:

There are some limitation here. We only support binary reputation. As in all users have the same reputation in the merkle tree. We could hack together non binary reputation by adding a user multiple times into the Merkle tree. So user 1 would have reputation 1 and user 2 who has two entries in the merkle tree has reputation 2. There is no way to burn reputation. There is no way to risk reputation.

Seems to me that all of those problems are solvable. In identification system we have two phases: Registration and Authentication. In this case the registration is done by the smart contract that assigns public keys to leaves:

A user creates a transaction that sends 1 ETH to the contract which then adds a leaf to the on-chain merkle tree

I think that the registration phase should be defined better and per use case. There is definitely a trade-off between the anonymity set (which at the moment is the full tree) and reputation grouping. i.e. if we work with a few merkle trees where each one corresponds to a different reputation level you reduce the anonymity set by a factor of the number of trees. You can also burn reputation levels by not accepting proofs of specific roots.

The hack you propose such that a user with higher reputation will have more entries in the tree is unclear to me. How the user will prove connection between two leaves without double-spend?

HarryR commented 6 years ago

Seems to me that all of those problems are solvable. In identification system we have two phases: Registration and Authentication. In this case the registration is done by the smart contract that assigns public keys to leaves:

A user creates a transaction that sends 1 ETH to the contract which then adds a leaf to the on-chain merkle tree

That is for the https://github.com/barryWhiteHat/miximus use case - where registration is depositing Ether for mixing, registration in any other case will be different, but the underlying mechanism will be the same - either adding an entry to a merkle tree, or being provided with the root of a merkle tree which contains all of the registrants.

I think that the registration phase should be defined better and per use case.

Yes, the registration phase depends on the use case, but this project will provide the mechanisms for registration regardless of use case.

i.e. if we work with a few merkle trees where each one corresponds to a different reputation level you reduce the anonymity set by a factor of the number of trees. You can also burn reputation levels by not accepting proofs of specific roots.

A way to work with multiple merkle trees would be to have separate 'Vote worth X', 'Vote worth Y' and 'Vote worth Z' trees, where each identity exists in only one tree (well, they can exist in multiple), this does reduce the anonymity set though.

If you create a merkle tree of 'Vote worth X', then have that as a leaf of 'Vote worth Y', which is a leaf of 'Vote worth Z' trees it would be possible for people of 'Vote worth X' to prove a path up to the 'Vote worth Z' root - thus increasing their vote weight - this is bad... 👎

If you think of the classical business of voting shares - the more shares you have the more weight your vote has - e.g. if you have weight 1000 you get 1000 votes and your identity exists 1000 times within the tree, if you have weight 1 you exist once in the tree and get one vote.

The user does not need to prove connection between two leaves, each leaf is a separate and unique secret which represents the right to one vote - if they have multiple they can vote multiply.

Can you think of a better mechanism which doesn't introduce too much complexity and allows arbitrary weights to leaves which doesn't change the zkSNARK circuit or reveal any information about who voted for what and when? e.g. if you know person X owns 40% of the voting rights, and it's a 40/60 split between answer A and B - who did person X vote for? There is a side-channel here where registering all of your votes at once will reveal who you are if people know how many shares you have - which reduces the anonymity set.

However, with a system where each person gets 1 vote - there is no side channel, each signature could be equally anybody else - perfect anonymity.

omershlo commented 6 years ago

@HarryR Thanks for this answer. It seems that you focus only on Voting use case using zk-snarks which indeed can work without proving connection between tree leaves belonging to the same user because every leaf will count as vote. zk-snark Voting system as A LOT of other problems, see discussion here for example: https://github.com/ZcashFoundation/GrantProposals-2018Q2/issues/22.

Voting rights != reputation in general. In my opinion the most classic and easy to implement in real life use case is Permission system. Here reputation means that some users have access rights while other users don't have access rights and some have partial rights. How in this case using 2 leaves per user without connecting them help us? Reducing the anonymity set but creating permission levels (each tree is a level) works but no doubt that there might be other solutions which are better.

To Conclude - I think that we should take the ideas from the doc and apply them per use-case because there are differences and gains if you assume specific use case.

HarryR commented 6 years ago

Reducing the anonymity set but creating permission levels (each tree is a level) works but no doubt that there might be other solutions which are better.

See the JSON signal descriptor in README.md, for example for the anonymous blog post example:

{
    "group": "anon-social-network",
    "external_nullifier": ["date"],
    "vars": {
        "title": "str",
        "body": "str",
        "date": "date"
    }
}

The signal attests membership of a group, restrictions on how often they can emit a signal, and variables required for the signal - this is the general purpose mechanism for signalling. Above I was just trying to provide some of my understanding around the problems of voting / reputation etc.

Can you suggest any other use cases, in JSON description format, which can be added to the list?

pgrzesik commented 6 years ago

Hey, I'm not sure if I understood you article correctly, does the voting example you posted only guarantees anonymity of the voter, but doesn't hide information about the vote and stake itself ? So, it's not possible to link the voter to the specific vote, but the vote itself is public. Is that correct ? Thanks, Piotr

HarryR commented 6 years ago

So, it's not possible to link the voter to the specific vote, but the vote itself is public.

Yes, that's correct.

However, the address you submit the transaction from will still reveal your vote... such is the nature of Ethereum and accounts.

pgrzesik commented 6 years ago

@HarryR Hey, thanks for answer. Yes, I know that the address will make it public anyway, I was just wondering if maybe you used ZKP for proving correctness of the vote, without revealing the vote as a whole.

Thanks Piotr

barryWhiteHat commented 6 years ago

I am unsure we are talking about the same things. Although it is revealed which way everyone voted there is no way to associate any vote with a user if everything is done correctly. Also its possible to hind the address with something like miximus or using transaction abstraction. That is jsut for the on chain case. The off chain case an attacker could do temporal analysis to link a vote to the broadcaster. But I don't know enough about privacy in p2p to discuss that.

Yes you can do this. If we could do recursive snarks (use a snark to verify another snark) we could even build a snark verifier for Semaphore. But the problem with doing this right now is that you cannot do recursive snarks with greater than 80 bit security.

Secondly the prover of this would get a list of all the votes. But you could make a scheme where each voter proves their section of the vote. But here you could do temporal analysis again to find out which way each node voted.

@pgrzesik can i ask you what is your use case ? I think it would just help the discussion if we knew waht you were planing to do and the kind of adversaries you needed to defend against.

burdges commented 6 years ago

I haven't dug into the code yet. Is there a signature happening inside the SNARK using the jubjub trick or something more general?

barryWhiteHat commented 6 years ago

No something more general. Bascially we make our own signature scheme based upon hashing secret info inside the snark so that no one knows what it is but it proves that you know it.

See https://hackmd.io/s/B17_qgrQ7#How-miximus-works

burdges commented 6 years ago

I think I see. You can only use each leaf once in principle, so you avoid revealing it. There are reputation systems should work like currency and only permit themselves to be used once revealing H(leaf ++ "usage"), but then provide a refresh procedure that reveals H(leaf ++ "refresh") and lets you add a new leaf that incorporates the reputation gain or loss from the previous transaction. You're revealing H(leaf ++ c) though where c is some contextual semi-nonce thing.

If you're really reusing reputation, and don't really care about the context or dislike it for being fragile, then you might improve performance using jubjub for a real resuable signature inside the snark, not sure.