MinaFoundation / Core-Grants


RFC-0007: zkPassport #11

Open es92 opened 8 months ago

es92 commented 8 months ago

A remaining question is exactly what new cryptographic primitives would need to be implemented in o1js, if any. It looks like RSA, DSA, ECDSA, and Sha-xxx are used (see pages 15, 16, and 17 here[4]).

I've looked around in some other places for confirmation, and this seems to be the full list!

All the tech checks out by my reading, this will be a very exciting addition to the ecosystem.

Nice! Do you know if this is already all in o1js after the recent ECDSA/SHA256 work, or is there more that would need to be in any grant on the cryptography front?

mrmr1993 commented 8 months ago

Do you know if this is already all in o1js

I don't know about the status of the Sha-XXX hashes, but let me ask around. We already have ECDSA, and I've since heard about experiments on the DSA and RSA fronts, so we're close already!

mitschabaude commented 8 months ago

Just came here to say that this is a fantastic idea that seems technically feasible and extremely promising!

teddyjfpender commented 8 months ago

Now that the Attestation API RFC is out, we can link the two together!

mazito commented 8 months ago

I think this is quite useful and many apps in MINA may benefit from it.

Having a well-defined, standard way of doing identity verification is key in many use cases. Socialcap is one of them, and any credential system may benefit from it too. In many cases you don't even need to access any personal info, just the verification that this is a "unique" person.

I also understand this will be limited to just passports with NFC enabled, and this creates some issues:

So I think this may be a good starting point for a more complete "identity verification", but many edge cases should also be acknowledged.

Vitalik's post "What do I think about biometric proof of personhood?" shows the difficulties associated with identity verification in general (including both biometrics and social proofs).

mazito commented 8 months ago

Continued thinking about this, and what this will actually prove is that "someone has a given NFC enabled passport in his/her hands".

It does not prove that that person "holding" the passport in his/her hands is the real owner of the passport. The passport could have been stolen or even belong to a dead person.

That is why all KYC programs require you to take a photo of the passport and a real time video/photo selfie to assert you are alive (proof of life) and that both are the same.

As I mentioned before "identity verification" is tricky :-)

mazito commented 8 months ago

And just to finish here (sorry for the long comments) this graph from Vitalik's article helps to define the scope of the NFC passport proposal:

[Image: the graph from Vitalik's article]

I would say that it falls in the "Specialized-hardware biometric" category, because in person biometrics are used and verified by the country issuing the passport when the passport is requested.

In fact the passport is a readable paper/electronic proof of that biometrics verification.

And so, for all the recursion fans here, it will be a proof of a proof.

es92 commented 8 months ago

Thanks @mazito , some thoughts:

  • Agree on social identity - I think extending this to use some kind of social graph would be future work (it would have to be far more advanced than the one here... but it seems plausible, and hopefully passports are a good starting point)
  • Agree on this being proof of holding a passport too - whoever implements a production version of this would probably want to take this into account, perhaps by enabling users to deprecate their old passport if they get a new one. I worry that including a photo / video stream may not work in the future, with generative AI video getting better, so just having a path for users to replace their old credentials seems potentially more robust. It seems likely there would be other considerations too in practice for a production version.

Is specialized hardware things like Worldcoin though? For that I'd agree; maybe this is slightly different though, since anyone can get a passport and scan it? I'd argue that it's Privacy High, Accessibility / Scalability High, Robustness of decentralization Fairly Low (depending on how much you trust governments issuing passports... which potentially could be better than a company-run solution), and security against "fake people" probably Medium, unless replacing your passport is considered easy, then High... Maybe there would need to be a new column, for something like "Govt issued digital ID"?

Pfed-prog commented 8 months ago

Also the option of revocation (for some reason such as someone detecting a fake passport, which can create a request to evaluate it and eventually revoke it). And this is in contrast with the desired "no censorship" feature in blockchains. So this is a complicated issue too :-)

What if there are multiple issuers attesting the same information, similar to having multiple passports?

Verifiable Credentials Data Model v1.1 image

EmrePiconbello commented 8 months ago

I just want to go over a few points and share our knowledge on certain aspects of the topic here.

There is no way to verify an NFC actually being real on its own. It can be easily faked. That's why the entire online KYC/AML process relies on facial recognition to match the OCR-extracted data from the passport.

To tackle this issue, there are a few ML/AI-based software solutions and a few new upcoming challengers which specialize in these kinds of documents and require human presence for verification. There are reputable third-party auditors for these software solutions. For example, ibeta.com conducts tests on accuracy, penetration, etc., from face scans to fingerprint scanners, which governments use in border controls and embed in passports.

The digital ID space has been booming in recent years, with many governments exploring experimental solutions that prioritize privacy. I can provide a few examples of this trend. In the UK, digitalidconnect.com offers a standard framework for digital ID. Meanwhile, oesterreich.gv.at provides a direct digital ID from the government, phasing out plastic IDs. I believe that in the next 5-10 years, we will see a full transition towards digital IDs, which will be much easier to integrate with ZKpassport.

For special hardware biometrics, the only affordable solution would be fingerprint scanners. To my knowledge, there is only one company that creates portable/small fingerprint scanners usable in this manner, which could still cost $40-$100 each, plus logistical challenges and distribution costs.

We believe the biggest issue is the duplication of identities. Because the space is growing very fast, everyone is trying different approaches. For example, digitalidconnect is a standard only for the UK. Even if we establish global standards, I don't think the isolated structure of these solutions will change because each solution accesses specialized data from local government databases, etc., for verification. I don't see governments sharing all this data in a global system as sensible.

mitschabaude commented 8 months ago

There is no way to verify an NFC actually being real on its own. It can be easily faked

@EmrePiconbello according to the RFC, passport data is signed by the host country. we would verify that signature in the proof. you wouldn't be able to fake that.

EmrePiconbello commented 8 months ago

@EmrePiconbello according to the RFC, passport data is signed by the host country. we would verify that signature in the proof. you wouldn't be able to fake that.

@mitschabaude I am not saying that the NFC data is forged. The issue is that NFC is very easily cloneable and the verification signature is part of it. You can't alter the data (though it can be outdated, etc.; those are still downsides), but on its own the NFC part is very easily duplicated. You just need to be close enough with an NFC device for a few seconds. Implementing this would be very wrong in my opinion, for that reason alone.

RaidasGrisk commented 8 months ago

NFC is very easily cloneable and the verification signature is part of it.

Besides reading the data from the NFC chip, seems like it is also possible to verify if the data and chip are genuine and not cloned or faked.

https://www.inverid.com/blog/cloning-detection-epassports https://secureidentityalliance.org/ressources/blog/secure-chips-trust-in-passports-what-is-pki

es92 commented 8 months ago

@EmrePiconbello according to the RFC, passport data is signed by the host country. we would verify that signature in the proof. you wouldn't be able to fake that.

@mitschabaude I am not saying that NFC data is forged. Issue is NFC is very easily cloneable and the verification signature is part of it. While you can't alter the data(data can be outdated etc. those are still the downsides). On it's own nfc part of very easily duplicated. You just need to close enough with nfc device for few second. Implementing that would be very wrong in my opinion just because of that reason.

Thanks on this - just added a section commenting on exactly what is being proved here and on security.

EmrePiconbello commented 8 months ago

NFC is very easily cloneable and the verification signature is part of it.

Besides reading the data from the NFC chip, seems like it is also possible to verify if the data and chip are genuine and not cloned or faked.

https://www.inverid.com/blog/cloning-detection-epassports https://secureidentityalliance.org/ressources/blog/secure-chips-trust-in-passports-what-is-pki

We did research on this for over a year. Unfortunately I don't have the resources on me right now, but with a quick Google search I found this: https://eprint.iacr.org/2005/095.pdf There was a more recent paper with a lot of detail that I couldn't find, but I have this: https://whenderson.dev/blog/biometric-passports/ The implementation and standards are still very similar. There are a few new standards coming up with active writable chips, but they are still far from being utilized, and the biggest issue is that implementations are all over the place even though there is a standard.

I'd like to summarize what I know concisely. We have many methods. Chip Authentication, while secure, can still be compromised if a chip is extracted from a legitimate passport and placed into a fake one, not to mention that it doesn't protect from cloning. However, integrating the chip brings complexity, and many countries, including the majority of the EU, do not utilize it. The chip also doesn't block reading the data. There is no single standard; implementations are all over the place, with measures that are suggested rather than mandatory. Even when a better standard is implemented, continued support for the older ones keeps the system insecure.

Additionally, we have a system that verifies these different types of authentication systems. However, due to the lack of uniformity and the insufficient security measures behind these methods, as evidenced by extensive research with companies developing hardware and software, along with third-party auditor firms testing these systems, we have concluded that matching all elements provides the highest level of certainty that the person and document are present at that time. This matching aligns with legal standards for verification. Our focus remains on biometric authentication solutions, as there are a few products we wish to explore in this area. We are currently looking into the viability of portable fingerprint scanners.

While we are not experts in this space, the global usage of the same few software and hardware solutions in law enforcement, border controls, and banking, despite differing approaches, instills confidence.

lampardlamps commented 7 months ago

@es92 cc @teddyjfpender and @mitschabaude there have been quite a few really good zkp ID proposals in zkIgnite 3, see, for example:

https://zkignite.minaprotocol.com/zkignite/zkapp-cohort-3/feedbackandrefinement/suggestion/652 https://zkignite.minaprotocol.com/zkignite/zkapp-cohort-3/feedbackandrefinement/suggestion/739 https://zkignite.minaprotocol.com/zkignite/zkapp-cohort-3/feedbackandrefinement/suggestion/741

Each of them has a sound and unique implementation plan, but they all touch upon interactions with passports. I'm sure etonec is similar too.

To take full advantage of the talent and make them pull in the same direction, I personally think it might be very beneficial if Mina Foundation could form a working group on zkp ID, regardless of the funding outcomes of the zkIgnite proposals, so that these teams can participate in the discussion about zkPassport and expand its horizon on potential applications.

KimlikDAO-bot commented 7 months ago

@EmrePiconbello raises a point regarding cloning. Besides this, people give their passports to others for all sorts of legit reasons (getting a visa, checking into a flight etc). Passports get misplaced or stolen all the time.

There is another, more subtle and more severe issue in the proof-of-uniqueness proposal outlined here. The proof-of-uniqueness secret has to be computed from some constant value such as the user's government unique ID. Whatever computation the user does in isolation can be replicated by a motivated actor to link all on-chain traces to people's government IDs. Note that even the nullifiers are deterministic functions of the proof-of-uniqueness secret, so the motivated actor can obtain the user's government ID just by looking at the nullifiers. (note government unique ID search space is tiny, even if it weren't guess-and-check attacks make this a no-go)
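To make the guess-and-check concern concrete, here is a minimal sketch. Everything in it is an assumption for illustration: `toyHash` is a non-cryptographic 32-bit FNV-1a hash standing in for whatever hash a real system would use, and `nullifier`, `bruteForce`, the `vote-2024` tag, and the 6-digit ID format are hypothetical. The point it shows is only that a value derived deterministically from a small ID space can be inverted by enumeration, no matter how strong the hash is.

```typescript
// Toy stand-in for a cryptographic hash (32-bit FNV-1a). NOT secure; with a
// real hash the enumeration below works the same way, only slower per guess.
function toyHash(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Hypothetical deterministic nullifier: depends only on the government ID
// and a public application tag, with no hidden randomness.
function nullifier(govId: string, appTag: string): number {
  return toyHash(appTag + ":" + govId);
}

// Guess-and-check: try every ID of a known, fixed-width numeric format.
function bruteForce(target: number, appTag: string, digits: number): string[] {
  const matches: string[] = [];
  const limit = Math.pow(10, digits);
  for (let n = 0; n < limit; n++) {
    let candidate = String(n);
    while (candidate.length < digits) candidate = "0" + candidate;
    if (nullifier(candidate, appTag) === target) matches.push(candidate);
  }
  return matches;
}

// An observer sees only the nullifier on chain, then enumerates all IDs.
const observed = nullifier("042317", "vote-2024");
const recovered = bruteForce(observed, "vote-2024", 6);
console.log(recovered.indexOf("042317") !== -1); // true
```

A one-million-entry search like this finishes almost instantly; a nine-digit national ID format is only a thousand times larger.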

Fortunately, one can engage in a trust minimized zero-knowledge protocol, which achieves info-theoretic unlinkability between the user's IRL ID and the proof-of-uniqueness secret. See this (for now; we'll make a public doc as soon as we can) https://zkignite.minaprotocol.com/zkignite/zkapp-cohort-3/feedbackandrefinement/suggestion/739/detail

RaidasGrisk commented 7 months ago

@es92 cc @teddyjfpender and @mitschabaude there have been quite a few really good zkp ID proposals in zkIgnite 3, see, for examples:

https://zkignite.minaprotocol.com/zkignite/zkapp-cohort-3/feedbackandrefinement/suggestion/652 https://zkignite.minaprotocol.com/zkignite/zkapp-cohort-3/feedbackandrefinement/suggestion/739 https://zkignite.minaprotocol.com/zkignite/zkapp-cohort-3/feedbackandrefinement/suggestion/741

Each of them have sound and unique implementation plans, but they all touch upon the interactions with passport. I'm sure etonec is similar too.

To fully take advantages of the talent and make them pull in the same direction, I personally think it might be very beneficial if Mina Foundation could form a working group on zkp ID, regardless of the funding outcomes of the zkIgnite proposals, so that these teams can participate in the discussion about zkPassport and expand its horizon on potential applications.

Just to add to the list, we've developed a fully functional zkp ID app as part of zkIgnite 2, called id-mask. Currently, the app's reach is limited geographically because we've only integrated it with a single local KYC personal data provider. Eager to see how we can leverage passport-NFC interaction.

Moreover, I believe there are some unexplored issues that every zkp ID app will encounter later on. We briefly discussed this with @EmrePiconbello on Telegram. How can we design systems that prevent others from sharing proofs? For instance, once a zk-proof is created using passport-NFC, how do we ensure that only the original creator of the zk-proof can utilize it, preventing unauthorized sharing?

EmrePiconbello commented 7 months ago

Thanks for the suggestion, @lampardlamps. On that note, I'd like to add this.

Because we have been conducting our research and planning for a very long time, I mentioned something similar to this in our proposal since zkIgnite 1 or 2. Now that we are preparing to launch the product, I want to discuss the details and establish some kind of standard, as our aim is to link all ID solutions to a single ID. Recently, @RaidasGrisk was very helpful with his insights on the matter. He developed ID-Mask, and I believe we have found a middle ground with a structure like hashed (name, surname, personal number) as public output. There is a lot of variance, which could be a limitation for some solutions, as not all solutions might include a personal number. I firmly believe we need these discussions to establish some kind of standard/best practice so that future interactions between these zkApps would be smoother. Otherwise, we will end up with many solutions that are disconnected from each other or cannot work with other zkApps.
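One thing any such standard would have to pin down is the exact byte-level input to the hash, since two providers only produce the same output if they serialize the fields identically. The sketch below is one possible canonicalization, not an agreed format: the `IdentityFields` shape, the `canonicalField` and `canonicalPreimage` helpers, the uppercase rule, and the unit-separator delimiter are all assumptions made for illustration. It produces the string that would then be fed to the hash; the hash itself is left out.

```typescript
// Hypothetical field set, following the (name, surname, personal number) idea above.
interface IdentityFields {
  name: string;
  surname: string;
  personalNumber: string;
}

// Normalize one field: Unicode NFC, trimmed, inner whitespace collapsed, uppercased.
function canonicalField(value: string): string {
  return value.normalize("NFC").trim().replace(/\s+/g, " ").toUpperCase();
}

// Join fields with the ASCII unit separator so field boundaries stay unambiguous.
function canonicalPreimage(fields: IdentityFields): string {
  return [fields.name, fields.surname, fields.personalNumber]
    .map(canonicalField)
    .join("\u001f");
}

// Two providers encode the same person differently (precomposed vs. combining
// accents, different casing and spacing) but arrive at the same preimage.
const fromProviderA = canonicalPreimage({
  name: "JOS\u00c9",
  surname: "Bl\u00e1ha",
  personalNumber: "AB 123",
});
const fromProviderB = canonicalPreimage({
  name: "Jose\u0301",
  surname: " bla\u0301ha ",
  personalNumber: "ab  123",
});
console.log(fromProviderA === fromProviderB); // true
```

This only addresses agreement between providers; it does nothing about fields that legitimately change over time or about the brute-force concerns raised earlier in the thread.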

es92 commented 7 months ago

@RaidasGrisk

Moreover, I believe there are some unexplored issues that every zkp ID app will encounter later on. We briefly discussed this with @EmrePiconbello on Telegram. How can we design systems that prevent others from sharing proofs? For instance, once a zk-proof is created using passport-NFC, how do we ensure that only the original creator of the zk-proof can utilize it, preventing unauthorized sharing?

This should be covered by the Attestation API, to make sure it doesn't leave the wallet of the user who has created the attestation. This would ensure browser pages cannot access the original credential, and only get context-specific proofs.

es92 commented 7 months ago

Agree, and that definitely makes sense re some kind of working group btw. I can check in if there's bandwidth to organize it or if it would need to be more ad hoc, lms

Pfed-prog commented 7 months ago

Moreover, I believe there are some unexplored issues that every zkp ID app will encounter later on. We briefly discussed this with @EmrePiconbello on Telegram. How can we design systems that prevent others from sharing proofs? For instance, once a zk-proof is created using passport-NFC, how do we ensure that only the original creator of the zk-proof can utilize it, preventing unauthorized sharing?

You can ensure that the sender is the owner by implementing a function with a struct that checks who sent the message and whether the struct owner is the same as the sender.

code from https://www.npmjs.com/package/pin-mina and

https://github.com/PinSaveDAO/PinSave/blob/503300a6d6395c1478ed4f9fe6c02a2c224e382c/packages/mina/src/NFTsMapContract.ts#L66-L71

@method initNft(item: Nft, keyWitness: MerkleMapWitness) {
  let initedAmount = this.totalInited.getAndRequireEquals();
  initedAmount.assertLessThanOrEqual(this.maxSupply);

  const sender = this.sender;
  sender.assertEquals(item.owner);
  ...
}

the struct

https://github.com/PinSaveDAO/PinSave/blob/503300a6d6395c1478ed4f9fe6c02a2c224e382c/packages/mina/src/components/NFT.ts#L12-L22

export class Nft extends Struct({
  name: Field,
  description: Field,
  id: Field,
  cid: Field,
  owner: PublicKey,
}) {
  changeOwner(newAddress: PublicKey) {
    this.owner = newAddress;
  }
}

RaidasGrisk commented 7 months ago

This should be covered by the Attestation API, to make sure it doesn't leave the wallet of the user who has created the attestation.

You can ensure that the sender is the owner by implementing a function with struct that checks who sent the message and whether the struct owner is the same as the sender

Do these solutions assume that a public address is linked to single IRL identity?

Let's think about proof-of-adulthood again. Two people work together to cheat the system: one is an adult, the other is underage. The adult uses their passport (or other means of passing private data) to create proof-of-adulthood. After making the proof, they link it to a public address (ignoring how it's done). Then they give the underage person the private key for that address. This tricks the system, right?

Is this a big problem with the system? The way we currently check for adulthood can also be cheated in similar ways. But if we aim for adoption and if more people start using the new system, they'll look for security problems. Would the government approve of this system if it has such flaws? I'm not sure of a good solution, just want to point out the problems so we can find good solutions.

mitschabaude commented 7 months ago

Do these solutions assume that a public address is linked to single IRL identity?

@RaidasGrisk Evan's architecture outlined here, and also the system of Worldcoin which is similar, ensures that every IRL identity can only be linked to a single public key. (That property is needed to make it a proof of uniqueness!)

So, yes, an adult can create a unique ID and then let their child use their computer (absolutely unavoidable feature of every digital system). But no, they can't create multiple wallets all "proved to be owned by an adult" and then hand them out. So, AFAIU, the issue you're concerned with is avoidable.

Btw, re ensuring in a zkApp that a user owns a certain address: It's easy - you just need to create an account update for that address and require a signature on it.

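// Creating an account update for the user's address and requiring a signature on it
// means the transaction is only valid if signed with that address's private key.
let update = AccountUpdate.create(userAddress);
update.requireSignature();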

@Pfed-prog the solution you posted is insecure, because this.sender is not proved to be the actual sender of the transaction: https://docs.minaprotocol.com/zkapps/o1js-reference/classes/SmartContract#sender

EmrePiconbello commented 7 months ago

@EmrePiconbello raises a valid point regarding cloning. Besides this, people give their passports to others for all sorts of legit reasons (getting a visa, checking into a flight etc). Passports get misplaced or stolen all the time.

There is another, more subtle and more severe issue in the proof-of-uniqueness proposal outlined here. The proof-of-uniqueness secret has to be computed from some constant value such as the user's government unique ID. Whatever computation the user does in isolation can be replicated by a motivated actor to link all on-chain traces to people's government IDs. Note that even the nullifiers are deterministic functions of the proof-of-uniqueness secret, so the motivated actor can obtain the user's government ID just by looking at the nullifiers. (note government unique ID search space is tiny, even if it weren't guess-and-check attacks make this a no-go)

Fortunately, one can engage in a trust minimized zero-knowledge protocol, which achieves info-theoretic unlinkability between the user's IRL ID and the proof-of-uniqueness secret. See this (for now; we'll make a public doc as soon as we can) https://zkignite.minaprotocol.com/zkignite/zkapp-cohort-3/feedbackandrefinement/suggestion/739/detail

How do you achieve uniqueness here I don't get it.

mitschabaude commented 7 months ago

Private zkPassport registration

How do you achieve uniqueness here I don't get it.

I looked at @KimlikDAO-bot's proposal and wanted to highlight it because it shows a way to avoid the privacy concerns of registering with a potentially brute-forceable passport hash which has to be sent to the chain in the open.

I'm not fully convinced that the protocol as written in the zkIgnite proposal achieves uniqueness/privacy (will comment on the proposal), but the core idea is great and works. I'll describe a slight variation of it here, that I think achieves both. For simplicity I only consider the passport NFC data and no extra liveness check.

The protocol consists of 3 steps, which I first summarize in non-mathematical form:

  1. A pre-registration step where the user posts their passport data in randomized/blinded form, and also proves that it comes from real passport data with a valid signature.
  2. An encryption assistance step where 2 different members of a public committee each "encrypt" the blinded passport data using a private key that only they know. The private keys of this committee have to be stable. If a user can re-run the same protocol against a new set of committee private keys, it would give them a second human id from the same passport.
  3. A final registration step where the user takes the encrypted passport data and removes their own randomness that they initially added. This yields data which is still encrypted, but no longer randomized: It's deterministically derived from the passport. We can call the result a unique human ID. One passport can only ever give you one ID (TODO: address the fact that a person can get a new passport - they shouldn't get a new ID). Since the ID combines private keys of two different committee members, neither of them can re-derive the ID from a given passport, and so neither of them can brute-force the data (except if they collude).
    The entire computation of step 3 is done in a zkapp method and results in successfully storing your wallet address (or a commitment to it) in a Merkle tree of humans, under the unique human ID as key.

This protocol achieves creating a Merkle tree of unique humans, which can now be used in several ways - either for anonymous use cases like private voting, where you just show that you own one of the addresses in that tree but not which one, and post a nullifier (so that every human can only vote once) -- or even for use cases where you interact with your address in the open and can prove that it is that of a unique human.

The committee only exists to assist with encryption. They are not trusted with any authority, and they have no access to the private passport data. Only by colluding and putting their secret keys together might they be able to brute-force the original passport data and learn the identity of a user who registered (but not necessarily the address they registered with, if we only store a commitment to that). We can increase the number of committee members from 2 to n to gain more confidence that at least 1 of n members will not collude.
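The two properties described above (one registry entry per unique human ID, one vote per human per poll) can be sketched in plain TypeScript. This is a toy model under loud assumptions: a `Map` stands in for the Merkle tree, `toyHash` is a non-cryptographic FNV-1a hash standing in for Poseidon, and the zero-knowledge membership proof that a real zkApp would verify is omitted entirely. `HumanRegistry`, `Poll`, and `pollNullifier` are hypothetical names.

```typescript
// Toy 32-bit FNV-1a hash standing in for Poseidon. NOT cryptographically secure.
function toyHash(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Stand-in for the Merkle tree of humans: unique human ID -> address commitment.
class HumanRegistry {
  private entries = new Map<number, string>();

  // Returns false if the ID is already registered (one passport, one entry).
  register(humanId: number, addressCommitment: string): boolean {
    if (this.entries.has(humanId)) return false;
    this.entries.set(humanId, addressCommitment);
    return true;
  }
}

// Per-poll nullifier: deterministic in (secret, pollId), so repeats are detectable.
function pollNullifier(userSecret: string, pollId: string): number {
  return toyHash(pollId + "|" + userSecret);
}

class Poll {
  readonly pollId: string;
  private seen = new Set<number>();

  constructor(pollId: string) {
    this.pollId = pollId;
  }

  // A real zkApp would also verify a proof that the nullifier belongs to a
  // registered human; that check is left out of this sketch.
  vote(nullifier: number): boolean {
    if (this.seen.has(nullifier)) return false;
    this.seen.add(nullifier);
    return true;
  }
}

const registry = new HumanRegistry();
const firstTry = registry.register(1234, "commitment-of-wallet-A");
const secondTry = registry.register(1234, "commitment-of-wallet-B");

const poll = new Poll("poll-1");
const n = pollNullifier("alice-secret", poll.pollId);
const firstVote = poll.vote(n);
const repeatVote = poll.vote(n);
console.log(firstTry, secondTry, firstVote, repeatVote); // true false true false
```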


Mathematical details

EDIT: I originally posted a flawed version of this protocol. Below, you find the fixed version.

(This is one possible way, I'm sure there are others. This here only relies on provable operations that exist now in o1js)

In short, the unique human ID that is computed in the final step is

$$ id = h(aH(p)) + h(bH(p)) $$

where $p$ is the passport data, $h(\cdot)$ is the Poseidon hash function, $H(\cdot)$ is a hash-to-curve function which maps field elements to a point on the Pallas curve, and $a$ and $b$ are the stable private keys of the encryption committee. Addition and $h(\cdot)$ hashing happen in the Pallas base field, which is Mina's native circuit field.

$id$ as defined above is a deterministic function of $p$, but can't be re-derived from $p$ except by someone who knows both committee private keys $a$ and $b$.

Preregistration. The user calls a zkapp method preregister(), which proves that their passport data $p$ is correct, by verifying their government's signature on it. We compute $P = H(p)$ with Poseidon.hashToCurve().

The user also passes in a random scalar $r$ and we compute $R = r P$, using provable scalar multiplication. The curve point $R$ is posted as an action to the zkapp.

Encryption assistance. This can happen offchain. Recall that the two members of the encryption committee have private keys $a$ and $b$, and assume that their public keys $aG$ and $bG$ are stored as state on the zkapp.

The user sends $R$ (the randomized passport hash) to both members' nodes, and each sends back $R$ multiplied with their private key -- $aR$ and $bR$, respectively.

They also send an offchain proof of this computation, created by a ZkProgram with public input $R$ and public outputs $aR$ and $aG$, and same for b. The user merges the two proofs into one.
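The blinding and committee steps so far can be simulated end to end in a toy group, which also shows why the user's randomness can later be removed: since $R = rP$, multiplying $aR$ by $r^{-1}$ (modulo the group order) gives $aP$. The sketch below is not o1js code and makes several assumptions: it uses the order-509 subgroup of squares modulo the prime 1019 in place of the Pallas curve (so exponentiation plays the role of scalar multiplication), a non-cryptographic `toyHash` in place of Poseidon, and hypothetical helper names. It omits all proofs and signature checks.

```typescript
// Toy group: squares modulo the safe prime P_MOD = 2*Q + 1 form a subgroup of prime order Q.
const P_MOD = 1019;
const Q = 509;

// Square-and-multiply modular exponentiation (all products stay far below 2^53).
function modPow(base: number, exp: number, mod: number): number {
  let result = 1;
  let b = base % mod;
  let e = exp;
  while (e > 0) {
    if (e % 2 === 1) result = (result * b) % mod;
    b = (b * b) % mod;
    e = Math.floor(e / 2);
  }
  return result;
}

// Toy 32-bit FNV-1a hash standing in for Poseidon h(.). NOT secure.
function toyHash(input: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Toy hash-to-group H(.): squaring a nonzero residue lands in the order-Q subgroup.
function hashToGroup(passport: string): number {
  const x = (toyHash(passport) % (P_MOD - 1)) + 1;
  return (x * x) % P_MOD;
}

// Inverse of r modulo the prime Q, via Fermat's little theorem.
function invModQ(r: number): number {
  return modPow(r, Q - 2, Q);
}

// id = h(aP) + h(bP), reduced modulo 2^32 in this toy version.
function combine(aP: number, bP: number): number {
  return (toyHash("h:" + aP) + toyHash("h:" + bP)) >>> 0;
}

// Full flow: blind with r, let the committee apply a and b, then unblind with r^-1.
function deriveId(passport: string, r: number, a: number, b: number): number {
  const P = hashToGroup(passport);
  const R = modPow(P, r, P_MOD); // user: R = rP
  const aR = modPow(R, a, P_MOD); // committee member 1
  const bR = modPow(R, b, P_MOD); // committee member 2
  const rInv = invModQ(r);
  return combine(modPow(aR, rInv, P_MOD), modPow(bR, rInv, P_MOD));
}

// Reference value computed directly from a and b, which no single party can do in practice.
function directId(passport: string, a: number, b: number): number {
  const P = hashToGroup(passport);
  return combine(modPow(P, a, P_MOD), modPow(P, b, P_MOD));
}

const idWithR1 = deriveId("passport-data", 77, 123, 456);
const idWithR2 = deriveId("passport-data", 301, 123, 456);
console.log(idWithR1 === idWithR2, idWithR1 === directId("passport-data", 123, 456)); // true true
```

The result does not depend on the blinding scalar, which is what makes the final ID a deterministic function of the passport even though the committee only ever sees randomized values.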

Final registration. The user calls a register() zkapp method, which does the following:

KimlikDAO-bot commented 7 months ago

Hi @mitschabaude, thank you for your kind words and thanks for the mention!

I still didn't get a chance yet to digest your post fully. Allow me some more time.

In short, in our solution, we use functional signatures (as opposed to relational signatures, where single digest may have multiple valid signatures) as verifiable hash functions.

The humanID secret is simply the sum of n parts, each of which is a (deterministic) hash of the user government ID number (example: user's SSN for a user from the US). This ensures that it's unique, right?

mitschabaude commented 7 months ago

@KimlikDAO-bot

The humanID secret is simply the sum of n parts, each of which is a (deterministic) hash of the user government ID number (example: user's SSN for a user from the US). This ensures that it's unique, right?

Ok, so here's how I understood your proposal -- the "shares" are the BLS signatures by the nodes after we multiplied by 1/Poseidon(r), so the sum of the shares is something like $aX + bX$ (where $a$ and $b$ are validator private keys, $X$ is the passport hash to a BLS12-381 curve point). I thought that if we know the public keys of the validators i.e. $aG$ and $bG$, then we can brute-force the value of $X$ since thanks to the pairing we can efficiently test whether

$$ (aX + bX, G) \stackrel{?}{=} (X, aG + bG) $$

for any candidate $X$. If this equation is true, then we know we have found the right $X$. So this would destroy privacy.
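Spelling out why this test works, with the pairing written as $e(\cdot,\cdot)$ (an assumed notation; the equation above abbreviates it as $(\cdot,\cdot)$), bilinearity gives

$$ e(aX + bX, G) = e((a+b)X, G) = e(X, G)^{a+b} = e(X, (a+b)G) = e(X, aG + bG) $$

The left-hand side uses only the published sum of shares and the right-hand side uses only the candidate $X$ and the public keys, so a candidate can be checked without knowing $a$ or $b$.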

This can be easily fixed by hashing the $aX$ etc. before using it as a public human id! (This is why I added the hashes $h()$, even though it's for a non-pairing curve, just to make sure I destroy any mathematical structure.)

The second issue is that you don't really describe (or maybe I missed it!) how the user proves to other parties that he has a valid, unique human id. You write

Note these signatures from each color class are interpreted as "verifiable hashes" in that the user can verify that the hash was computed truthfully by performing a signature check with the public key corresponding to each color class. The final HumanIDv2 secret is the sum of all the shares is mod V, the Vesta scalar field size.

Crucially, the BLS12-381 signature verification does not happen inside a circuit, but happens as normal computation, so this protocol should be quite fast.

However, if I'm concerned about uniqueness, it's surely not enough that the user knows that they have a valid unique id. Others must be convinced of this as well! Otherwise the user could just make up a number and say "this is my unique id". That's why in my version I do the whole verification in a circuit. But you write "Crucially, the BLS12-381 signature verification does not happen inside a circuit", so that confused me.

Both of those issues are easy to fix and I really appreciate your input and proposal @KimlikDAO-bot!

EmrePiconbello commented 7 months ago

I want to get some details about the brute-forcing argument against hashing, since I believe there is enough entropy. Let me start with an example.

Leyla Shashi Bláha, Lilianne Matthew Wragge, Nox Charissa Ahmad, Cardea Godehard Farrell, Fastúlfr Sixte Swango

Here are 5 outputs from a random name generator. Considering all languages and different symbols and them being different codes in Unicode. People can have 2 names or even more. There is no standard for GOV ID numbers either, with some having not just numbers, and their lengths varying greatly. With all these variables, I don't understand how this can be brute-forceable since any hash we see can be from any country with any form of different structure.

The aim of creating this kind of hash was to make it possible for the same user to be known to zkApps via different ID providers. If that's not a solution, we still need some kind of standard to make this communication possible. So I'd like to hear people's ideas on that matter and how we can achieve that.

KimlikDAO-bot commented 7 months ago

Compute the unique human id, id=H(aP)+H(bP).

Great, thank you! Your solution fixes some important issues and simplifies the proposed protocol overall.

Some bells and whistles:

1) Most (if not all) passport chips contain a signer, enabling them to make a verifiable presentation of the data they store. In ePassport terminology, this is called Active Authentication (AA). Let us call the chip private key and the corresponding public key by chip_sk and chip_pk.

For a passport with AA support, the passport signing authority signs P || chip_pk and stores the signature chip_data_sig in the chip. An AA enabled verifier sends a random challenge and ensures that the purported passport is able to sign the challenge with chip_sk.

In our case, before we talk with the encryption helpers, we will send the challenge Poseidon(wallet_address, seed) to the passport chip, which will respond with a signature chip_challenge_sig.
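A minimal sketch of that challenge-response step, under loud assumptions: SHA-256 stands in for Poseidon, and a toy unpadded RSA key built from two small Mersenne primes stands in for the chip's real AA key. The names `challenge_for`, `chip_sign` and `verify` are made up for illustration. It only shows the shape of the exchange: the challenge is bound to the wallet address, the chip signs it with chip_sk, and anyone holding chip_pk can check the response.

```python
import hashlib

# ASSUMPTION: toy RSA parameters for illustration only (tiny key, no padding).
P, Q = 2**61 - 1, 2**89 - 1          # two Mersenne primes
N = P * Q                            # public modulus, part of chip_pk
E = 65537                            # public exponent, part of chip_pk
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, plays the role of chip_sk


def challenge_for(wallet_address: str, seed: bytes) -> int:
    """ASSUMPTION: SHA-256 stands in for Poseidon(wallet_address, seed)."""
    digest = hashlib.sha256(wallet_address.encode("utf-8") + b"|" + seed).digest()
    return int.from_bytes(digest, "big") % N


def chip_sign(challenge: int) -> int:
    """What the passport chip would do internally with chip_sk."""
    return pow(challenge, D, N)


def verify(challenge: int, chip_challenge_sig: int) -> bool:
    """What a verifier does with chip_pk = (N, E)."""
    return pow(chip_challenge_sig, E, N) == challenge


seed = b"example seed"
challenge = challenge_for("wallet-A", seed)
chip_challenge_sig = chip_sign(challenge)

# The signature checks out for the wallet it was requested for, and not for another one.
print(verify(challenge, chip_challenge_sig))
print(verify(challenge_for("wallet-B", seed), chip_challenge_sig))
```

Because the challenge already commits to the wallet address, a signature obtained for one wallet cannot be replayed for a different one.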

We will send a proof to the encryption helpers attesting

2) For now, the nullifier generation will have to be done by semi-trusted dApp code. It will be even better if the encryption helpers sign off the "share" for use of (a blinding of a) the user's Mina wallet address.

This way, even if the dApp code steals the user's HumanID secret, they cannot generate new nullifiers with it, but they can detect the user's past actions.

mitschabaude commented 7 months ago

Considering all languages and different symbols and them being different codes in Unicode. People can have 2 names or even more. There is no standard for GOV ID numbers either, with some having not just numbers, and their lengths varying greatly. With all these variables, I don't understand how this can be brute-forceable since any hash we see can be from any country with any form of different structure

IMO we want to prevent even very common names, and passports that have the most standardized GOV ID, from being brute-forceable.

Also, I think it could be an issue to include the name in the hash that forms a unique ID, since names do change at least once in the life of many people. They could get a second "unique" ID after marrying :D

@EmrePiconbello do you know whether parts of the data on each passport are absolutely guaranteed to be unique over the lifetime of a citizen? GOV ID maybe?

If yes, using just this unique part as $p$ in the protocol outlined above could give us a unique ID with very strong properties, which should be highly reusable across applications.

The other, non-unique parts that we also want to associate with a user, like name, could go into the commitments that form the Merkle leaves. (Or could be stored alongside the ID in other ways, depending on the architecture)

KimlikDAO-bot commented 7 months ago

@mitschabaude With your permission, I'll update our proposal to fix all the issues in the protocol and to give you due credit.

Btw, preregister() is not an on-chain thing right? We should be able to compute the humanID completely off-chain, but talking with the helper nodes. We need an on-chain tx only when presenting a context specific nullifier of it, e.g., Poseidon(zkapp_address, human_id_secret) along with a proof. I may be missing something though.
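To make the context-specific nullifier concrete, here is a rough sketch. It assumes SHA-256 as a stand-in for Poseidon, and `NullifierRegistry` is a hypothetical app-side record, not something from the proposal. In the real flow the nullifier would come with a proof; this only shows the linkability properties.

```python
import hashlib


def stand_in_hash(*parts: bytes) -> bytes:
    """Length-prefixed SHA-256 over the inputs. ASSUMPTION: stands in for Poseidon."""
    digest = hashlib.sha256()
    for part in parts:
        digest.update(len(part).to_bytes(4, "big"))
        digest.update(part)
    return digest.digest()


def nullifier(zkapp_address: str, human_id_secret: bytes) -> str:
    """Context-specific nullifier: one value per (zkApp, human) pair."""
    return stand_in_hash(zkapp_address.encode("utf-8"), human_id_secret).hex()


class NullifierRegistry:
    """Hypothetical per-app record of nullifiers that were already presented."""

    def __init__(self, zkapp_address: str) -> None:
        self.zkapp_address = zkapp_address
        self.seen = set()

    def present(self, value: str) -> bool:
        """Accept a nullifier once; reject any repeat (same human acting twice)."""
        if value in self.seen:
            return False
        self.seen.add(value)
        return True


secret = b"example human_id_secret"  # illustrative value only
app_a = NullifierRegistry("zkapp-A")
n_a = nullifier(app_a.zkapp_address, secret)
n_b = nullifier("zkapp-B", secret)

first_try = app_a.present(n_a)    # accepted
second_try = app_a.present(n_a)   # rejected: same human, same app
```

The same secret yields unrelated values in zkapp-A and zkapp-B, so the two apps cannot link the user, while each app can still detect a repeat.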

Considering all languages and different symbols and them being different codes in Unicode. I don't understand how this can be brute-forceable

Such a hash gives anyone a lookup table: if you know someone's name and id, you can look up their Mina wallet address if they have used the zkPassport app (or ~50 likely Mina addresses if the preregistration is used.)

We don't need to try all possible Unicode characters. Even starting from a simple names dictionary one should be able to brute force all of it in minutes.

If you have a list of names / government IDs in front of you, like a state actor or a large company would have, no brute forcing is needed.
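A small sketch of the lookup-table point, assuming the published value is a plain hash of name and government ID (SHA-256 here as a stand-in; the records, names and `deanonymize` helper are invented for illustration):

```python
import hashlib


def id_hash(name: str, gov_id: str) -> str:
    """ASSUMPTION: stand-in for the published identity hash H(name, gov_id)."""
    return hashlib.sha256(f"{name}|{gov_id}".encode("utf-8")).hexdigest()


# Hypothetical public records: published hash -> wallet address.
published = {
    id_hash("Alice Example", "12345678901"): "wallet-A",
    id_hash("Carol Example", "55555555555"): "wallet-C",
}

# What a state actor or large company might already hold: (name, government ID) pairs.
known_people = [
    ("Alice Example", "12345678901"),
    ("Bob Example", "10987654321"),
]


def deanonymize(known, records):
    """One hash per known person is enough; no guessing of Unicode strings is needed."""
    found = {}
    for name, gov_id in known:
        digest = id_hash(name, gov_id)
        if digest in records:
            found[name] = records[digest]
    return found


linked = deanonymize(known_people, published)
```

Alice's wallet is recovered with a single hash evaluation, while Bob, who never registered, simply produces no match.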

mitschabaude commented 7 months ago

@mitschabaude With your permission, I'll update our proposal to fix all the issues in the protocol and to give you due credit.

Sure, thanks!

Btw, preregister() is not an on-chain thing right? We should be able to compute the humanID completely off-chain, but talking with the helper nodes.

The idea for preregistering was to avoid the problem with getting frontrun, i.e. as you're trying to register your humanID (sending it in the open), someone else does so before you can. I was thinking about the specific context of registering into a Merkle tree of confirmed unique humans, which I thought was useful.

We need an on-chain tx only when presenting a context specific nullifier of it, e.g., Poseidon(zkapp_address, human_id_secret) along with a proof. I may be missing something though.

At first glance this sounds like it could also work.

Although I see potential downsides with not recording that stuff onchain:

  • Furthermore, you require that everyone who wants to be convinced of your unique ID always has to verify a zk proof. In particular, other zkapps which want to use it have to recursively verify the proof. This causes them to use a lot of constraints and makes their proofs bigger, which is avoided if verifying your ID is just a Merkle lookup

KimlikDAO-bot commented 7 months ago

Would something like this be possible: The communication with the encryption helpers happens using Vesta points. I generate the proof of correct derivation of the human_id_secret only once and then store this proof off-chain (for instance inside the KimlikDAO Pass)

When generating the human id nullifier, my circuit would prove that there exists a proof of correctness for the humanID and that the nullifier is computed truthfully using the correct human_id.

The exact details above may be inaccurate, but can't we efficiently prove the existence of a correct proof and then prove other things such as correct nullifier computation?

mitschabaude commented 7 months ago

When generating the human id nullifier, my circuit would prove that there exists a proof of correctness for the humanID and that the nullifier is computed truthfully using the correct human_id.

The exact details above may be inaccurate, but can't we efficiently prove the existence of a correct proof and then prove other things such as correct nullifier computation?

We can do that, but again, both the proof of correctness of the humanID and especially the recursive proof verification take quite a lot of constraints, which would make everything that uses humanIDs heavier than necessary.
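For comparison, the "just a Merkle lookup" alternative boils down to a handful of hashes along one path. A minimal sketch outside any circuit, assuming SHA-256 instead of Poseidon and made-up helper names (`build_levels`, `witness`, `verify`):

```python
import hashlib


def h(data: bytes) -> bytes:
    """ASSUMPTION: SHA-256 stands in for the circuit-friendly hash."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, leaf hashes first. Odd levels pair the last node with itself."""
    level = [h(b"leaf:" + leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        level = [h(b"node:" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def witness(levels, index):
    """Sibling hashes from leaf to root, each tagged with 'sibling is on the left'."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        node = level[sibling] if sibling < len(level) else level[index]
        path.append((node, sibling < index))
        index //= 2
    return path


def verify(root, leaf, path):
    """Recompute the root from one leaf and its witness: one hash per tree level."""
    node = h(b"leaf:" + leaf)
    for sibling, sibling_is_left in path:
        pair = sibling + node if sibling_is_left else node + sibling
        node = h(b"node:" + pair)
    return node == root


human_ids = [b"human-id-0", b"human-id-1", b"human-id-2"]  # illustrative leaves
levels = build_levels(human_ids)
root = levels[-1][0]
ok = verify(root, human_ids[2], witness(levels, 2))
```

Verification cost grows with the tree depth only, which is why a lookup like this is much lighter than recursively verifying a full proof.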

EmrePiconbello commented 7 months ago

IMO we want to prevent even very common names, and passports that have the most standardized GOV ID, from being brute-forceable.

GOV ID is mostly an 11-digit number. Since there is no standard, there might be regions with fewer digits, and that can be a problem even if we combine it with the name and surname before hashing.

Also, I think it could be an issue to include the name in the hash that forms a unique ID, since names do change at least once in the life of many people. They could get a second "unique" ID after marrying :D

Yes, it can change. Taking all of this into account, after about a year of research we came up with this proposal to cover everything: https://zkignite.minaprotocol.com/zkignite/zkapp-cohort-3/feedbackandrefinement/suggestion/652/discussions Uniqueness actually comes from ML-extracted data from the face being hashed.

@EmrePiconbello do you know whether parts of the data on each passport are absolutely guaranteed to be unique over the lifetime of a citizen? GOV ID maybe?

GOV ID doesn't change but not all passports have gov ID in it :) I don't think we can achieve it from any government document consistently.

On another note: the purpose of hashing the name and GOV ID is to match a user's IDs across providers. We can avoid exposing it as a public output if we can't find a way to do this safely. The reason we want something like this is that we plan our platform as a bridge for all identities, so when a Pass3 user says "this is my idmask user, I want to link them", we need to run some kind of check to be sure these two identities belong, at some level, to the same person. Making it a public output was mainly meant to be friendlier for zkapp developers, so they can tell who is unique across ID protocols. Considering what we discussed, this doesn't look likely, but we could still have something like that where ID protocols link with other ID protocols on the client side.

mitschabaude commented 7 months ago

GOV ID doesn't change but not all passports have gov ID in it :) I don't think we can achieve it from any government document consistently.

@EmrePiconbello thanks for your insights and research. It seems to me that the best solution, in a first iteration, might be to restrict the unique proof of personhood to people who have such a unique government id available from a signed document.

uniqueness actually comes from ml extracted data from face being hashed.

That doesn't sound like something you can do in a zk proof though, or that can be trusted by a smart contract and used in a permissionless way.

I'm looking at your architecture diagram https://www.mermaidchart.com/raw/4861da3a-9370-417a-b3fb-861f585f086f?theme=light&version=v0.1&format=svg

There's no detail about the role Mina or zkApps play in it - just two steps "create proof of user" and "validate proofs". What is the content of those proofs?

Proofs are created by the Pass3 backend, with the input of a face scan which was confirmed by the facetec SDK to be a unique human. It seems that the facetec SDK is able to tell you about uniqueness by asking their database about previous face scans. See facetec.com:

For ongoing user authentication, FaceTec’s 3D face matching compares a new Liveness-proven 3D FaceMap with the user’s previously-stored 3D FaceMap

Obviously there's no way to put this interaction with facetec inside a zk proof. You can prove pure computations, but not network calls or lookups in a traditional db

EmrePiconbello commented 7 months ago

@mitschabaude we are open to any solution and like to pivot if we can have a conviction on a path there.

@EmrePiconbello thanks for your insights and research. It seems to me that the best solution, in a first iteration, might be to restrict the unique proof of personhood to people who have such a unique government id available from a signed document.

Extensive research needs to be done to single out the weak ones if this approach is utilised, which would limit the user base drastically. For some of the “unique” ones, the number is generated by a fixed algorithm, for example one digit being determined by gender, which reduces the entropy drastically and opens the possibility of brute forcing.
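A back-of-the-envelope sketch of that entropy argument. The ID structure and the hash rate are assumptions chosen for illustration, not properties of any specific country's scheme:

```python
import math


def entropy_bits(search_space: int) -> float:
    """Bits of entropy of a uniformly random value from the given search space."""
    return math.log2(search_space)


def seconds_to_enumerate(search_space: int, hashes_per_second: float) -> float:
    """Worst-case time to hash every candidate once."""
    return search_space / hashes_per_second


ASSUMED_RATE = 1e9  # ASSUMPTION: hashes per second available to an attacker

# 11 fully free decimal digits.
free_space = 10 ** 11

# HYPOTHETICAL structure: 9 free digits, one digit limited to 2 values (e.g. gender),
# and one checksum digit that is fully determined by the others.
structured_space = 10 ** 9 * 2

for label, space in [("free", free_space), ("structured", structured_space)]:
    bits = entropy_bits(space)
    secs = seconds_to_enumerate(space, ASSUMED_RATE)
    print(f"{label}: {bits:.1f} bits, {secs:.0f} s to enumerate")
```

Even the unstructured 11-digit case stays below 40 bits, so the ID alone cannot carry the hash's security; any structure in the digits only makes the search shorter.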

That doesn't sound like something you can do in a zk proof though, or that can be trusted by a smart contract and used in a permissionless way.

There's no detail about the role Mina or zkApps play in it - just two steps "create proof of user" and "validate proofs". What is the content of those proofs?

The aim here is to keep the shared and stored data minimal. For low-risk authentication, using recursive proofs with simple data verification (such as some basic info gathered via NFC scan) is the main part where Mina comes into the picture. Yet, because our current design is still uncertain about which fields are public and which are private on chain, and there is no established standard, further examination is required to find a flow with the least possibility of forgery and the least public information.

Proofs are created by the Pass3 backend, with the input of a face scan which was confirmed by the facetec SDK to be a unique human. It seems that the facetec SDK is able to tell you about uniqueness by asking their database about previous face scans. See facetec.com:

Proofs are generated by the user side. We have the software, and we have full control. In the proposed approach, we match three elements from the user: passport data from NFC, passport data from OCR, and matching face data with the face obtained from OCR and NFC. Here, the SDK does the work on the client side. The server verifies the liveness and returns a positive or negative result. The proof is generated according to this response on the client side.

For ongoing user authentication, FaceTec’s 3D face matching compares a new Liveness-proven 3D FaceMap with the user’s previously-stored 3D FaceMap

Obviously there's no way to put this interaction with facetec inside a zk proof. You can prove pure computations, but not network calls or lookups in a traditional db

Yes, unfortunately, the machine learning model is very resource intensive, making it unsuitable for running on a phone. We are not planning to utilise continuous face scan authentication. The current plan is to store the hash result from the scan on a data availability solution that we can utilise.

  • so, what concretely prevents me from creating the same "proof" for 100 of my identities, by just inventing "unique hashes" and without running my face through the facetec software?

It's not just about the face, as I shared above; all the methods need to match. You can't invent unique hashes without performing these steps. Additionally, there are only a few organisations conducting audits/benchmarks on these solutions, and they measure FRR (false rejection rate) and FAR (false acceptance rate). According to these metrics, they rank among the top performers, with rates as low as 0.0000008% in some cases.

We cannot tamper with the software SDK (I believe we can utilise the SDK inside proof) and server-side function simultaneously. Therefore, we cannot push a random hash from the server side as long as the SDK is utilised on the frontend, which will be public.

The plan involves storing the hashes on the chain, but due to limitations with Mina, we are still uncertain about how to approach this. We want the hash storage to be public if it cannot be verified with zkApp. We are still considering many options, such as verifying age without uniqueness or verifying specific data aspects, since it's customizable.

  • is Pass3 a trusted actor in this system? can only they create unique identities, and do we have to trust them to not create duplicate identities?

Pass3 is our branding; we utilise the technology from FaceTec. In every existing solution, you need to place trust in one of the many providers around the globe. They all undergo audits and benchmarks from organisations like NIST.gov, iBeta.com, etc., and comply with many standards. The issue is that you always have to give your data to a third party for compliance. For example, https://www.jumio.com/kyx/ has a direct plugin for AML screening DB provider, which performs the checks. However, for these checks, they need to share your data, or after KYC, they need to retain your data for 5-10 years, depending on regional laws.

There are only a few solutions that allow customers to build a custom solution. That's why we are not offering any compliance and legal aspects; we are simply providing verification. For each verification, we require all the steps again (because we do not plan to store any identifiable data). However, after completing a full ID enrollment, something like a signature from a private key combined with proof generated from NFC data could be sufficient to authenticate low-risk cases. Our long-term plan is to run it in a hardened confidential VM environment, where it is completely isolated. Additionally, we will strive to make it as open as possible or have 3rd party audits and certifications.

The proposal is kept simple and concise because when it comes to these details, it quickly becomes out of scope (due to the delivery timeline of three months) and complicated. Also, some decisions about the flow will be made based on the viability of zkApp and ProtoKit. Since we are not 100% familiar with the platform we are building, we do not want to create detailed flows at the start and end up delivering something completely different due to limitations.

In summary, our goal is to deliver the most private and unique ID solution we can on the platform, solving the interoperability issue between ID solutions. As a final note, as outlined in the proposal, many governments are currently exploring digital ID solutions, with some testing or migrating towards them. These digital IDs are mostly developed with privacy concerns in mind. Since they are directly from governments, they are also compliant. In the very long term, what we envision is that instead of KYC providers like Onfido, Sumsub, and many others collecting vast amounts of user data and relying on these solutions for compliance, they would store proof of digital IDs.

mitschabaude commented 7 months ago

We cannot tamper with the software SDK (I believe we can utilise the SDK inside proof)

You can't use the facetec SDK inside a proof

The plan involves storing the hashes on the chain, but due to limitations with Mina, we are still uncertain about how to approach this.

Before worrying about how to store hashes on chain I really recommend to think about how to prove that those hashes are not just random numbers. From my perspective, that's a core part you need to get right and not a detail to be figured out eventually.

EmrePiconbello commented 7 months ago

You can't use the facetec SDK inside a proof

I should have been clearer. They have an SDK, part of which is OCR functions, so the idea here is to use those. Another option we have is building a simple OCR that can be run inside a proof. As long as the first onboarding combines all the factors (OCR, NFC, liveness), authentication will mostly rely on the linked wallet plus occasional checks that do not cover all elements, only a small data match from NFC or OCR.

Before worrying about how to store hashes on chain I really recommend to think about how to prove that those hashes are not just random numbers. From my perspective, that's a core part you need to get right and not a detail to be figured out eventually.

Here, I am not sure, so I'll answer from two angles.

When we talk about our backend just throwing random hashes to the front, and since our backend can't be run inside a proof, there are a few ignited proposals around ML and AI. One of them revolves around proving computation, so if something like that is possible, we'd like to explore it. However, in the current state, we don't have a way of running these computations inside snarkyjs.

Also, I don't think it's a very realistic expectation because with all ID solutions and everything, you get verification from somewhere. It could be a government endpoint or a KYC provider, etc. You don't run their software in snarkyjs; you just build an oracle which verifies the data from them. This is what our proposal is at a very basic level. Where we differentiate is we have control, so we can adjust it according to developments. We can extract the data and verify it on the client-side without sharing any data on our end, but that brings a lot of issues with fake IDs. Here's a recent article https://x.com/josephfcox/status/1754514949995384996?t=PO5Nnvn4IzCRv4LS6I-LJA&s=09 where OKX, which looks like it was not doing enough for KYC/AML, is mentioned. They are using Jumio, which I mentioned above, and they have a liveness check, AML, everything baked in, but OKX prefers not to utilize them. (I am speculating at this point since I can't know the exact details, but this mostly happens because, however much these solutions claim easy onboarding, people often don't follow the instructions or try to do the process in very suboptimal conditions with a dirty camera lens, a lot of glare, etc. So instead of losing the customer at onboarding, they skip these checks so they can onboard users without friction.) The reason we are taking all these steps is to ensure fake or cloned documents are not getting in.

When we talk about the hash, they are not just random numbers; they are the result of user data. It's not in our core plan to utilize something like that, but we believe it is required as a standard, so that at least some part of the ID solutions would have certain aspects that are interoperable and usable by zkapps. This hash is derived from the user data that exists on the user's client, where computation and proof generation happen.