<new proposal> Extending WebAuthn Protocol for Remote Authentication

thedreamwork commented 3 years ago

Last updated: March 15

Modern identification techniques ask users to scan their ID documents or complete face verification in order to be verified by an online service. Such techniques require users to capture a photo with the camera of their mobile device. Data authenticity and fidelity are crucial for user identification techniques, so it is vital to address feasible attacks.

Here we will concentrate on one such attack: the injection attack. Such attacks are easy to perform, devastatingly successful, and represent a significant barrier to the widespread deployment of authentication systems that use consumer smart devices in web scenarios. Instead of attacking the system in front of the capture device (e.g., the webcam on a smart device), the injection attack operates by copying a digital representation of a real biometric signal and then injecting it into the system at some later date and/or different location. Since the copied signal is a bit-for-bit exact replica of a legitimate signal, it can completely bypass security checks such as liveness detection and anti-spoofing.

One of the most important goals is the ability of the remote service to verify the authenticity of the received data. In our proposal, remote attestation uses a modified WebAuthn protocol to establish public-key signatures between the client and the server: the client holds the private key and shares the public key with the verifier of the signature. When a photo has been captured, the client computes the hash of the photo and signs it with the private key. When the user submits the photo, the server computes the hash of the photo with the same hash function as the client and, using the public key of the corresponding user, verifies that the signed hash matches the hash of the received photo. If this succeeds, the uploaded photo has not been tampered with; otherwise it is potentially dangerous to trust the client with camera data.

Remote Attest is an extension to the WebAuthn API. Image Attest is one form of Remote Attest: an API for protecting media authenticity. In a nutshell, it is an interface for talking to a client data producer. You give it a challenge. You get safe client data back.

There are two operations: MakeCredential and GenerateAssertion.

Credentials Management API	WebAuthentication API	Remote Attest API
navigator.credentials.store	navigator.credentials.create	navigator.credentials.create
navigator.credentials.get	navigator.credentials.get	navigator.credentials.generateAssertion

The website uses the navigator.credentials.create call to create a cryptographic key on the device and then attest to the key's validity. This produces an attestation object that the website passes to the server, along with the corresponding key identifier. The server verifies the attestation object and then extracts the embedded public key and other information. Later, the server uses the key to verify assertion objects that cover camera data, produced by the navigator.credentials.generateAssertion call.

Here are our updated thoughts on remote authentication as of early March. This document should explain most of what we're thinking: Ensuring the Authenticity and Fidelity of Client Data.pdf

sbweeden commented 3 years ago

FYI I know of at least one existing native-mobile-application FIDO2 based solution in this space already.

https://www.nist.gov/ctl/pscr/authim-0

I am not sure how WebAuthn would comprise part of a solution though since WebAuthn is a JavaScript API for browsers (the FIDO client) which do not typically directly control device cameras.

thedreamwork commented 3 years ago

> FYI I know of at least one existing native-mobile-application FIDO2 based solution in this space already.
>
> https://www.nist.gov/ctl/pscr/authim-0
>
> I am not sure how WebAuthn would comprise part of a solution though since WebAuthn is a JavaScript API for browsers (the FIDO client) which do not typically directly control device cameras.

In order to solve this problem, it is necessary to define different levels of security. At the browser level, we can only assume that the browser is secure, so this solution can be implemented by simply combining WebRTC and WebAuthn: use the WebRTC API to get the media and secure it with WebAuthn's key capabilities. At the operating system level, key management can be linked to the camera data. Keys are usually located in secure storage, typically protected by a TEE or SE. At the hardware level, Qualcomm now has an off-the-shelf solution. More detail here.

Firstyear commented 3 years ago

Wouldn't it be better, if we were to provide this kind of API, to target it as a generic "blob signing" process? That way it could be used for other purposes like signed/authenticated uploads of various datatypes that may be RP specific beyond just "images" or "identities".

This could be seen as a process where the account/identity has public keys associated via webauthn, and then those keys can be used to validate other uploads that require stricter assurances to authenticity.

thedreamwork commented 3 years ago

> Wouldn't it be better, if we were to provide this kind of API, to target it as a generic "blob signing" process? That way it could be used for other purposes like signed/authenticated uploads of various datatypes that may be RP specific beyond just "images" or "identities".
>
> This could be seen as a process where the account/identity has public keys associated via webauthn, and then those keys can be used to validate other uploads that require stricter assurances to authenticity.

This is what we want. There are many business scenarios for remote authentication, covering various kinds of device sensor information, especially location data, camera data, voice data, etc.

Firstyear commented 3 years ago

Your suggestion isn't requesting this though. Your suggestion involves WebRTC and many other parts, and is targeting only image capture and signing. You even named it "GenerateImageAssertion". Not "GenerateBinarySignature" or similar.

thedreamwork commented 3 years ago

> Your suggestion isn't requesting this though. Your suggestion involves WebRTC and many other parts, and is targeting only image capture and signing. You even named it "GenerateImageAssertion". Not "GenerateBinarySignature" or similar.

Based on your suggestion, the API name may be modified, including parameter names. But the overall process is the same.

Firstyear commented 3 years ago

If this is going to be a generic binary data signing mechanism, then yes, the process would be very similar to get assertion. We would need the RP to send "create signed assertion options" or similar with a challenge to prevent replays, and the collected client data would be extended to contain a sha256 of the data we want to assert. Otherwise the process is similar to assertion.
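A minimal sketch of what that could look like, assuming a hypothetical extra client data member named `dataHash` (it is not part of the WebAuthn specification, and the function names below are illustrative only). The one part taken from the existing spec is that an assertion signature covers the authenticator data followed by the SHA-256 hash of the serialized client data, so anything placed in the client data ends up under the signature.

```python
import hashlib
import json


def build_client_data(challenge_b64url: str, origin: str, payload: bytes) -> bytes:
    """Serialize client data that also carries a hash of the payload to assert."""
    client_data = {
        "type": "webauthn.get",
        "challenge": challenge_b64url,
        "origin": origin,
        # Hypothetical member for this sketch; not defined by WebAuthn.
        "dataHash": hashlib.sha256(payload).hexdigest(),
    }
    return json.dumps(client_data, separators=(",", ":")).encode("utf-8")


def signed_message(authenticator_data: bytes, client_data_json: bytes) -> bytes:
    """Bytes covered by an assertion signature: authData || SHA-256(clientDataJSON)."""
    return authenticator_data + hashlib.sha256(client_data_json).digest()


def payload_matches(client_data_json: bytes, payload: bytes) -> bool:
    """RP-side check that the uploaded payload is the one named in the client data."""
    client_data = json.loads(client_data_json)
    return client_data.get("dataHash") == hashlib.sha256(payload).hexdigest()
```

The RP would still verify the signature over `signed_message(...)` with the registered public key and check the challenge as for any assertion; `payload_matches` only ties the uploaded bytes to the signed client data.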

It may be better to frame the proposal as signing arbitrary data than using the camera example. It makes it clearer that the proposal can have many more applications.

It's also worth discussing how this creates a chain of trust to the data, and the weaknesses of this system need to be discussed as well.

sbweeden commented 3 years ago

I can't see how this is really going to provide a proof of anything particularly useful unless the camera itself (in the example use case of signing an image) was hardware that included an attest-able FIDO2 authenticator capability and the process of taking and then immediately signing the photograph was therefore "within the authenticator boundary". To suggest the browser can safely broker that transaction is (IMHO) placing too much trust in the relationship the browser has with other peripherals on the device.

sbweeden commented 3 years ago

And before someone says "this same problem exists with the use of FIDO authenticators for authenticating users at websites today", keep in mind the problem being solved. WebAuthn and human-consumable PKI that it facilitates is really designed to address the problems of phishing and subsequent credential stuffing attacks, which I think it does reasonably well.

thedreamwork commented 3 years ago

> And before someone says "this same problem exists with the use of FIDO authenticators for authenticating users at websites today", keep in mind the problem being solved. WebAuthn and human-consumable PKI that it facilitates is really designed to address the problems of phishing and subsequent credential stuffing attacks, which I think it does reasonably well.

We want to tackle the injection attack issue. In the simplest example, someone hooks the WebRTC API and then injects previously prepared media data. This is not too difficult to do on a website. We have many business scenarios where we need to ensure the authenticity of the client data. More specifically, we need to make sure that the data originates from the browser or the operating system and is not externally injected.

thedreamwork commented 3 years ago

> I can't see how this is really going to provide a proof of anything particularly useful unless the camera itself (in the example use case of signing an image) was hardware that included an attest-able FIDO2 authenticator capability and the process of taking and then immediately signing the photograph was therefore "within the authenticator boundary". To suggest the browser can safely broker that transaction is (IMHO) placing too much trust in the relationship the browser has with other peripherals on the device.

Web authentication needs to include both remote authentication and local authentication. The current WebAuthn is a set of local authentication protocols. Local authentication needs to ensure that the result is authentic, whereas remote authentication needs to ensure that the data is authentic. There are similarities, but the threats are different.

emlun commented 3 years ago

I agree with Shane that the image capture and signing would both need to happen within the "authenticator boundary" in order for this to be meaningful. It seems like the proposal could be construed like that, but I think the whole "image attest service" would have to be implemented in a secure enclave - otherwise it's likely that a rooted/jailbroken phone, for example, could easily spoof the attestation and use it to sign arbitrary data. And if it's possible to jailbreak even one phone in this way, that means you could connect that one jailbroken phone to a web service allowing anyone to circumvent any system relying on ImageAttest. I'm personally not familiar enough with secure enclaves to say whether that is feasible.

If we assume that can be solved, though, I think the proposed API could be simplified to fit within the existing extensions framework. I think this could be an authenticator extension something like this:

$$extensionInput //= (
  imageAttest: true
)
$$extensionOutput //= (
  imageAttest: { img: bytes }
)

and the signature is just the ordinary assertion signature, which includes the above extension data in the signed data. This would make it even more clear that the image capture and signing all happens within the authenticator boundary.

The registration procedure probably doesn't need to change at all, other than adding the extension input. This would however mean there is no difference between an ordinary authenticator and an "ImageAttest authenticator" - an "ImageAttest authenticator" would simply be an ordinary authenticator that also supports the imageAttest extension (most likely a platform authenticator integrated in the operating system).

> It may be better to frame the proposal as signing arbitrary data than using the camera example. It makes it clearer that the proposal can have many more applications.

I disagree - this would successfully defend against malicious RPs, but the purpose of the solution proposed here (as I understand it) is to defend against malicious users. The distinguishing feature is that the client is the source of the data to be signed, so making it an arbitrary data signing service would make it useless for this purpose.

In fact it is already possible to (ab)use the WebAuthn API for arbitrary data signing: just set challenge = sha256(nonce, documentToSign) for example.
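A minimal sketch of that challenge-binding trick on the RP side. The comment does not say how the nonce and the document are combined, so this sketch assumes a length-prefixed concatenation to keep the encoding unambiguous; the function names and the example origin are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import secrets


def make_bound_challenge(nonce: bytes, document: bytes) -> bytes:
    """Return sha256(len(nonce) || nonce || document) for use as the challenge."""
    h = hashlib.sha256()
    h.update(len(nonce).to_bytes(4, "big"))  # length prefix avoids ambiguous splits
    h.update(nonce)
    h.update(document)
    return h.digest()


def b64url(data: bytes) -> str:
    """Base64url without padding, the encoding of `challenge` in clientDataJSON."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def challenge_matches(client_data_json: bytes, nonce: bytes, document: bytes) -> bool:
    """Check that the signed client data carries the challenge bound to `document`."""
    client_data = json.loads(client_data_json)
    expected = b64url(make_bound_challenge(nonce, document))
    return hmac.compare_digest(str(client_data.get("challenge", "")), expected)


# Example: the server issues a nonce; the client returns clientDataJSON.
nonce = secrets.token_bytes(16)
document = b"document to sign"
client_data_json = json.dumps({
    "type": "webauthn.get",
    "challenge": b64url(make_bound_challenge(nonce, document)),
    "origin": "https://rp.example",
}).encode("utf-8")
```

This only shows the binding step. The ordinary assertion signature check still has to pass, and, as noted above, it proves nothing about where the document came from.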

emlun commented 3 years ago

...but on second thought it's of course not a good idea to embed the raw image (likely several megabytes) in the authenticator data. Luckily we could still embed the raw image bytes in the client extension outputs, and just a hash of that in the authenticator extension output. This would probably mean that only platform authenticators can implement the extension in practice, but that would likely be the case anyway. So this should still be possible to implement within the extensions framework.
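A rough sketch of the server-side check this split would imply, assuming the raw image arrives in the client extension outputs and a SHA-256 hash of it arrives in the signed authenticator extension outputs. The dictionary shapes and the `imgHash` member name are assumptions made for illustration; nothing like them is defined in the thread or in WebAuthn.

```python
import hashlib
import hmac


def image_matches_attested_hash(client_ext: dict, authenticator_ext: dict) -> bool:
    """Compare the unsigned raw image against the hash covered by the signature."""
    # Hypothetical shapes: {"imageAttest": {"img": bytes}} and
    # {"imageAttest": {"imgHash": bytes}}.
    image = client_ext.get("imageAttest", {}).get("img")
    attested = authenticator_ext.get("imageAttest", {}).get("imgHash")
    if not isinstance(image, bytes) or not isinstance(attested, bytes):
        return False
    return hmac.compare_digest(hashlib.sha256(image).digest(), attested)


# Example values standing in for what a client would return.
photo = b"\x89PNG...raw image bytes"
client_outputs = {"imageAttest": {"img": photo}}
authenticator_outputs = {"imageAttest": {"imgHash": hashlib.sha256(photo).digest()}}
```

The check is only meaningful after the assertion signature over the authenticator data has been verified, since that is what makes the hash trustworthy.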

thedreamwork commented 3 years ago

Many thanks to @emlun. If the concern is security, then it can only be placed within the "authenticator boundary". We have implemented the signing of camera data in a TEE environment on an Android device. The camera data is sent directly to the secure enclave. In addition, Qualcomm now has an off-the-shelf solution, so the surrounding ecosystem is already well developed. We do believe there are rich scenarios for both remote authentication and local authentication. If this is added to the existing extension framework, it will greatly affect its scope of use. We have modified the documentation and interface and would appreciate your further suggestions.

emlun commented 3 years ago

> We have implemented the signing of camera data in TEE environment on an Android device. The camera data is sent directly to the secure enclaves.

(Emphasis added)

The point I'd like to make is: are you also collecting the camera data within the TEE boundary? Because the above sounds to me like the camera data is collected outside the TEE, and then sent into the TEE to be signed. But my understanding of the problem here is that you don't trust that the user is honest about where the image comes from. For example, a cryptocurrency exchange might want to prevent users from submitting a passport photo purchased on the dark web.

Assuming that is the case:

If you (the server) trust the end-user and the client, then you do not need this extension. You can just collect the data to be signed, mix it into the challenge and get a normal WebAuthn assertion for that challenge.
If you (the server) do not trust both the end-user and the client, then you cannot trust any code running outside the TEE or your server. By extension, you cannot trust any data originating outside of the TEE or your server. Therefore you must run the whole camera stack - trusted hardware running trusted code - inside the TEE, otherwise your TEE will be signing untrusted data.

If you rely on an untrusted camera application (as opposed to one code-signed and trusted by the TEE) to provide the image data, then the signature will not provide the assurances you seem to want it to.

Please correct me if I misunderstand what problem you want to solve.

equalsJeffH commented 3 years ago

On the 2021-03-17 call:
@ve7jtb will follow up here and attempt to clarify.

ve7jtb commented 3 years ago

@thedreamwork

I am having a hard time following exactly what you are trying to do.

If the image is captured in an app, are you trying to use the FIDO attestation to identify that the app is legitimate? I can see how at least on Android that might work with the SafetyNet attestation. It might not work so well on other platforms.

In the browser, I don't see what it is getting you, other than some sort of integrity that the page has come from some particular origin (more or less) and that a key generated for the origin is signing over some hash generated by the page JavaScript.

Have you considered doing this with the current specs by inserting the hash as part of the challenge?

Other people have also asked for an extension that could sign over arbitrary data. That could happen with existing CTAP2 authenticators by putting the data in clientData at the WebAuthn layer.

I still see the major flaw in this for browser applications being that tokenBinding has not been adopted, so WebAuthn can't provide an end-to-end encryption guarantee, only an origin one, and even at that there are XSS and other attacks that could load malicious JS into the browser, especially if the user is complicit.

Perhaps the basic attestation is sufficient for your use case and we are trying to close too many holes.

thedreamwork commented 3 years ago

@emlun So sorry, I may not have expressed myself clearly. We have worked with Qualcomm to implement a demonstration in which we obtain camera data within the TEE boundary. Inside the Trusted Execution Environment we can process this data, for example by signing or encrypting it. Camera data is stored in TEE memory from the beginning, so applications in the REE cannot modify or access it. Based on this behavior, we can guarantee that the image data comes from the camera and is not injected externally. If the attacker cannot send the original image to the remote server for verification, they can only re-capture it from a screen or a printout, and that can be detected using AI algorithms.

We make this proposal to prevent injection attacks. We want to make sure that the data really comes from the user's device. Today's devices have an increasing variety of sensors, while the cost of falsifying sensor data is very low. Common attacks include fake location, fake health data, fake voice, and fake photos. In many scenarios, we have to determine whether the data from the device is faithful.

From the security perspective, if the browser just collects the data to be signed internally, I guess that also makes sense, in a really sad way. That might be like the SafetyNet attestation on the Android platform, where the operating system guarantees that the browser has not been tampered with. Therefore, we need to rely on a system-level service (e.g. Apple attestation) to tell us whether we are running in a secure environment. This is helpful for protection against general attackers. If you want to protect against higher-level attacks, as you rightly say, we need to make sure the whole camera stack is secure. Different levels of security can be achieved with different approaches.

Finally, we are very sorry for the confusing expressions. General verification requires a registration process. For example, for remote face verification, we need to enroll the original face data in the registration procedure so that we can perform face matching in the verification process. In the documentation, we only describe how to ensure the authenticity of the client data. Naturally, authenticity of data is the basis of remote face verification. Local authentication emphasizes the authenticity of the results, while remote authentication emphasizes the authenticity of the data. In my opinion they are in fact essentially the same. Please let me know if anything else is unclear.

emlun commented 3 years ago

Ok, thanks a lot for clarifying! I'm glad I was mistaken about that, then.

So in that case, like you said, the whole procedure from photo capture to signing happens in the system-level service, which in the WebAuthn model would correspond to the authenticator layer. The system you describe clearly requires support from the operating system and hardware, and there would be no way for a standalone browser to implement the feature without that support from the platform. Therefore it seems to me like this would be most appropriate as an authenticator extension, and doesn't need any changes to the client-level navigator.credentials API. Making it an authenticator extension also reinforces that the whole procedure takes place within the secure "authenticator boundary". This will also prevent injection and replay attacks as long as each request uses a unique challenge.
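The replay protection mentioned above depends on the RP issuing each challenge once and refusing to accept it a second time. A minimal in-memory sketch of that bookkeeping follows; the class name, the time-to-live value, and the storage choice are all assumptions for illustration, and a real deployment would likely tie challenges to a session.

```python
import secrets
import time


class ChallengeStore:
    """Issue random challenges and allow each one to be consumed only once."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        self._pending = {}  # challenge bytes -> expiry time (monotonic clock)

    def issue(self) -> bytes:
        """Create a fresh 32-byte challenge and remember it until it expires."""
        challenge = secrets.token_bytes(32)
        self._pending[challenge] = time.monotonic() + self._ttl
        return challenge

    def consume(self, challenge: bytes) -> bool:
        """Return True only for a known, unexpired, not-yet-used challenge."""
        expires_at = self._pending.pop(challenge, None)
        return expires_at is not None and time.monotonic() <= expires_at
```

A replayed assertion carries a challenge that has already been removed from the store, so `consume` returns False for it and the RP can reject the request before looking at the signature.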

However, it also seems to me like this is conflating user verification and data signing. The model for biometrics in WebAuthn has so far been that biometric data is never directly exposed to the RP - instead the authenticator simply reports back whether biometric verification was successful. The flow you describe instead sends biometric data back to the RP server. Is that necessary? If the RP can trust the authenticator (TEE), is it not enough for the authenticator to verify the facial scan locally and only report the result as the UV flag?

ve7jtb commented 3 years ago

I agree that this sounds more like a CTAP extension; however, as @emlun states, FIDO has a policy that requires biometric template storage and matching on the device. I understand this is intended for remote identity proofing.

I suspect that this probably needs to be a separate API even if there are some similarities with WebAuthn.

John B.

Kieun commented 3 years ago

For the purpose of identity proofing (or verification), I understand that the captured data should be sent to the remote server while ensuring the authenticity of the data. This is more like an attestation scheme that takes some captured biometrics as the to-be-signed data; this might be performed entirely within the TEE. So this requirement seems to call for some APIs or an extension to attest the captured image (or biometrics). On the other hand, the request also seems to be just about extending user verification from local to remote.

thedreamwork commented 3 years ago

It's clear to me that web authentication needs to cover both local and remote parts. The existing WebAuthn protocol seems to contain only the local part. As devices become more and more perceptive, the means of verification are becoming more and more abundant. Remote authentication allows a lot of verification that could previously only be done offline to be shifted online. Given the current situation, we urgently need to address the security risks involved in remote authentication. Ideally it should be a unified specification, as the two are essentially the same. However, extending the existing interface with many parameters seems to make its definition unclear. Perhaps the two could differ in their flows, security risks, use cases, or threat models and capabilities.

In the local authentication scenario, since the data is stored on the device, the same device is required in the registration and verification phases. With remote authentication, we can register on device A and verify on device B. We just need to make sure that the data is authentic and faithful, so in my opinion the registration and verification phases can be unified. We also need to keep in mind that biometric authentication is only as secure as the physical inputs and sensors used to gather it. Local authentication is used to prove that I am I, while remote authentication is mostly used to prove who I am. So there are many cross-platform authenticators in local authentication, but I think what is needed for remote authentication is a platform authenticator. Extending the CTAP protocol may be a bit too limited. A new interface may be more appropriate. I do not know if I am making myself clear. And lastly, a very big thank you to @ve7jtb for your time and consideration.

thedreamwork commented 3 years ago

@emlun Local authentication is used as an alternative to passwords, but now there are many scenarios that go beyond passwords. For example, consider a user who logs on to a system by entering a user ID and password. The system uses the user ID to identify the user. The system authenticates the user at the time of logon by checking that the supplied password is correct. Identification is the ability to identify uniquely a user of a system or an application that is running in the system. Authentication is the ability to prove that a user or application is genuinely who that person or what that application claims to be. After years of work, we have replaced password authentication in many scenarios, while many scenarios require confirmation that we are real people. There are many identification applications for remote authentication.

Establishing confidence in a user’s identity is paramount for many organizations. This could range from a customer signing up for a new bank account, to a new employee starting a new role working from home, to a citizen trying to register for pandemic-related financial support on a government website. In some cases, this confidence is needed to mitigate the risk of financial fraud, in others it’s to prevent a bad actor from accessing your company systems, and in some, it’s a regulatory requirement. The COVID-19 pandemic has accelerated the need for robust identity proofing in digital channels and also prevented many of the in-person interactions where identity proofing has typically taken place. Furthermore, organizations have accelerated their digital transformation initiatives, and with that has come the need to know who is really on the other end of that internet connection. For many years, the foundation of online identity proofing has been a data-centric approach.
This involves checking the identity data (e.g., name, address, date of birth, social security number) entered by a user against sources such as electoral records, credit bureau data and census information. The identity assurance achieved with this capability used in isolation is relatively low, as there is no assurance that the user entering the data is actually the owner of the identity. More recently, there has been a surge of interest in document-centric identity proofing, more informally known as the “ID + selfie” process. In this process, a user captures an image or video snippet of their photo identity document, which is assessed for signs of tampering or counterfeiting. The photo on the document is then compared with a “selfie” (still photo or short video) taken by the user submitting the document. A critical component when assessing the selfie is presentation attack detection (commonly referred to as “liveness detection”), which confirms the genuine presence of the user. This offers a far higher degree of confidence in the identity and that the owner of the identity is present. In that light, it's not enough for the authenticator to do verification locally. I'm not sure if you're aware of it or not.

Kieun commented 3 years ago

One thought: if the requirement is to get the captured images while ensuring their authenticity, I don't understand why we need both "create" and "generateAssertion" APIs. It seems that we only need to provide attestation for the data.

ve7jtb commented 3 years ago

Now that I understand the desired goal, it may be clearer to describe this as WebAuthn with Server Side Biometrics as opposed to Remote Authentication.

thedreamwork commented 3 years ago

@Kieun @ve7jtb Thanks a lot for all of your helpful advice. After much discussion, we will make a round of modifications to the original proposal. Drawing on our implementation experience, we will then formally submit a pull request.

emlun commented 3 years ago

Ok. I do understand that you need the photos to make it back to the server for identity proofing. What I was trying to get at was whether those photos need to be full "biometric signals" that the server verifies as biometrics, or if the biometric matching could be kept on the client side while the server simply receives back a plain photo.

So I'm imagining that the server would not verify any biometrics, but instead trust the client's assertion that "this credential has been registered with facial recognition, and this (plain) photo shows the person registered". Later assertions would then assert that "the previously registered person has passed facial recognition for this credential", and the server can refer back to the original photo for audit logs and such. The plain photo wouldn't be enough on its own for the server to verify liveness, anti-spoofing, etc., but as long as the client (TEE) is trusted the server can trust the client's assurance that liveness- and anti-spoofing checks have been done.

It's a subtle difference, and I don't know if an approach like this would suffice for your applications - but if possible, it might make it easier to make the case that this still respects FIDO principles while still providing most of the capabilities you need (I think).

thedreamwork commented 3 years ago

How to combine existing protocols is worthy of our serious consideration. With our current proposal, we are also trying to extend the WebAuthn specification. We have learned the following lessons from it.

The FIDO protocols are designed from the ground up to protect user privacy. The protocols do not provide information that can be used by different online services to collaborate and track a user across the services. Biometric information, if used, never leaves the user’s device. However, the reality is that there are indeed many scenarios that require the user's real identity. Many times we need to know who is really on the other end of that internet connection. This offers a far higher degree of confidence in the identity and that the owner of the identity is present. Any modification we make to the FIDO protocol will change its principles.
FIDO's mission is to develop and promote authentication standards that help reduce the world’s over-reliance on passwords. It has proven to be very successful in some uses. Over the years, people have adopted more secure and convenient alternatives to passwords. Our mission is to make the Internet more authentic and reliable. With the development of the Internet, cyber security risks have also changed significantly. Counterfeiting, tampering, and fraud are extremely common nowadays. We are trying to address some of these emerging issues.
I also think remote and local authentication are not in conflict with each other and can complement each other. For example, a user can go through an eKYC step to re-register a new authenticator for account recovery. Naturally, authenticity of data is the basis of eKYC. On the other hand, once we get an identification through the KYC process, we can use it to do authentication, just like the eID schemes rolled out across Europe.
The computing power available to today's devices inside a trusted execution environment is still limited. We are not yet able to implement very complex functions (e.g. image recognition) in the secure environment. Many applications will be more convenient for us to handle in the cloud, and we need to ensure the authenticity of the data they receive. There is already a very large number of such application scenarios.

Thank you very much, @emlun, for your consideration. We are also thinking about how to build a new interface to meet as many needs as possible. Do you have any suggestions or experiences to share?

ve7jtb commented 3 years ago

What is being proposed as remote authentication is a mechanism to do server-side biometrics.

The promise of FIDO is to do on-device biometrics when biometrics are used.

In discussions with the WG there is concern that allowing remote biometric capture as part of something branded as FIDO or WebAuthn would compromise people's confidence in the privacy.

While part of the credential manager API might be used as a method to do remote documentation verification, the feeling of the WebAuthn working group is that this should be kept separate from the WebAuthn specification.

As such the intention is to close this issue.

w3c / webauthn

<new proposal> Extending WebAuthn Protocol for Remote Authentication #1580