Closed anderspitman closed 5 months ago
I'm not so sure about overreaching.
As far as I know, at least European companies are legally obliged (EU GDPR) to record with which 3rd parties personal data of their end-users have been shared. Before sharing the data, it must be ensured that the end-user gave consent. This is what the FedCM popup dialog is doing, but as the dialog is a browser feature, it needs to inform the IdP about the client the data is sent to (and whether the dialog was actually shown) so that the IdP can release the personal data to the 3rd party including writing a record of that transaction.
Ha. I never realized until this moment how similar "overarching" and "overreaching" were. I changed the title to be clearer. Sorry for the confusion!
You bring up some good points though. In FedCM, technically it's the browser and not the IdP providing the information to the RP right? The IdP returns a token to the browser and the browser sends it to the RP, based on the user's consent. Assuming you have a token format that could be parsed by the browser, it could verify that only information that the user consented to is provided. Obviously this would require changes to FedCM, including:
It would be possible for a FedCM IdP to accomplish something similar with the current defined endpoints, but would require violating the spec.
This is such an interesting idea.
It could look something like this:
There are a couple of things that come to my mind:
UPDATE: I think you are right, it is not implementable today: the browser shares the Origin in the IdP assertion endpoint, which wouldn't allow you to omit the clientId parameter.
Is there any chance of closing this hole, either by changing the wording of the spec to remove the requirement for the IdP to be able to identify the RP, or (even better) by changing the endpoints to not even use client_id?
Maybe we could make client_id optional, and skip the client_metadata_endpoint request.
I agree the violation is minimal on the technical side. On the policy side I could see there being issues such as those @obfuscoder raised.
@aaronpk is definitely more qualified than me to anticipate security flaws with the idea. For replay attacks, what specific scenario are you thinking of?
One vulnerability this might open up is a DDoS vector. If client IDs are random, you could hit the ID assertion endpoint over and over forcing the IdP to mint a lot of tokens. You can rate limit by IP (and I suspect this would be sufficient), but won't be able to rate limit by client ID anymore.
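As a rough illustration of the per-IP fallback mentioned above, here is a minimal sliding-window limiter sketch in Python. The class name IpRateLimiter, its parameters, and the limits shown are hypothetical and not part of FedCM or any particular IdP; a real deployment would more likely enforce this at a proxy or with a shared store.

```python
from collections import defaultdict, deque


class IpRateLimiter:
    """Hypothetical per-IP sliding-window limiter for a token-minting endpoint."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Maps each IP to the timestamps of its recently accepted requests.
        self._hits = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        """Return True if a request from `ip` at time `now` may proceed."""
        hits = self._hits[ip]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

The caller passes the current time explicitly (for example time.monotonic()), which keeps the sketch easy to test. Note that it keys only on the IP address, since with random client IDs there is nothing else stable to key on.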
Without getting into the technical bits of the proposal, this is starting to sound a lot like the discussions in the "wallet" space which provide this same kind of privacy property.
In the wallet world, an "issuer" issues credentials that are "held" by the wallet, and then "presented" to an RP. So by definition, in this model, the IdP never knows which RPs the credentials are presented to.
The reason I bring this up is that I think any discussion of this privacy property should happen in conjunction with the wallet discussions, rather than trying to be shoehorned on top of OAuth/OpenID Connect.
Without getting into the technical bits of the proposal, this is starting to sound a lot like the discussions in the "wallet" space which provide this same kind of privacy property.
That occurred to me too. @anderspitman , FWIW, there is a browser API for that too:
https://wicg.github.io/digital-credentials/
We don't know yet how the DC API relates to the FedCM API, but as we move along, we are interested in figuring that out!
I guess there are two parts to the suggestion here:
client_id infrastructure is used by the IDP, not something the browser needs. If the IDP is RP-agnostic then in theory its PP/TOS can be some 'general' links, although there was an issue about letting the RP provide those directly.
One thing I had missed is that the PP/TOS links are optional (as is the entire client metadata endpoint). Just leaving it out should solve that part.
As for the other, why is it necessary to enforce CORS on the assertion endpoint? Why does the IdP care who it's asserting to? This seems like a violation of least privilege.
I appreciate that the digital credentials spec may be more in line with this type of functionality, but the reality is that FedCM is the protocol that may actually get widespread adoption in the authentication space. For example, are Google and other social login providers likely to implement authentication support on top of digital credentials?
One of the core reasons LastLogin exists is to create a privacy barrier between upstream IdPs and RPs, so that IdPs only know that the user is logging in to LastLogin, and not what RPs they're using. But this requires users to trust LastLogin not to abuse that information, instead of trusting IdPs. I'd prefer if they didn't have to trust LastLogin at all. But if the browser is going to be sending the Origin header whether I want it or not then they have to trust me.
As a potential implementer and supporter of FedCM, this is a serious drawback for me.
As for the other, why is it necessary to enforce CORS on the assertion endpoint? Why does the IdP care who it's asserting to? This seems like a violation of least privilege.
We're sharing the contents of a cross-origin credentialed fetch with the RP, so the IdP must explicitly agree to sharing that information. This is how the web works. Also, the IdP is sharing user information with the RP, so my intuition is that most IdPs do care who they share it with. I'm curious though, how does LastLogin force itself to be blind to the RP requesting the credential nowadays?
There was previously some discussion about a cached FedCM version. Perhaps the IdP could 'store' some ID assertions, and when FedCM is invoked the RP could receive the stored value. If there is no credentialed fetch at that time then we can avoid the IDP knowing who the RP is. Would something like that work?
While I'm sympathetic to the privacy concerns, I really think the digital credentials API is the better place for that kind of thing.
In the consumer world of "sign in with google/facebook/etc", the IdPs absolutely want to know where the user is signing in, and they don't even support the concept of an unregistered client using the API.
In the enterprise world, the enterprise IdPs also absolutely want to know where the user is signing in, and they also limit the apps with which a user can use their enterprise identity.
Similar concerns apply in open banking and in research and education.
And here's the really funny part, if I'm bringing my own IDP as a user (see FedCM for IndieAuth), I as a user also want to know which RPs I've used my identity at.
@npm1:
We're sharing the contents of a cross-origin credentialed fetch with the RP, so the IdP must explicitly agree with sharing that information
IMO with FedCM it's actually a subtly different situation. See my comment above. The IdP isn't sharing information directly with the RP. It's sharing information with the browser, which has full control over what to give to the RP, and knowledge of what has been consented to be shared. So the IdP only has to trust the user agent. But that's necessary anyway because the browser could lie about the RP in the first place.
How does LastLogin force itself to be blind to the RP requesting the credential nowadays
It can't. All I can do is not store that information (all login data is stored client-side in JWTs), point to the code, and hope people trust me when I say that's what I'm running. That's one reason why I'm pushing for more privacy-oriented protocols.
There was previously some discussion about a cached FedCM version. Perhaps the IdP could 'store' some ID assertions, and when FedCM is invoked the RP could receive the stored value. If there is no credentialed fetch at that time then we can avoid the IDP knowing who the RP is. Would something like that work?
This would make it so the IdP doesn't always know when I'm using an RP, but they would still have a complete list of the RPs I use, right?
@aaronpk:
IdPs absolutely want to know where the user is signing in, and they don't even support the concept of an unregistered client using the API
If we can't allow users to opt out of this sort of tracking, can we at least make it possible for IdPs to opt out? As it currently stands, I think I just need a way to tell the browser not to send me the Origin header, and maybe a way for RPs to specifically request privacy-focused IdPs (or at least indicate that an IdP is incapable of tracking the login).
On the ID assertion endpoint, the IdP may use the random client_id in its assertion process. For example, it might place the client_id in the aud property of an OIDC ID token. This allows the RP to verify that the token is intended for it.
And what would prevent evil.example.com from requesting a token for good.example.com, when the IdP can't verify the client_id / aud against an origin? You are even increasing the attack surface with this. You need to somehow validate that your token is actually being delivered to the correct client. This would only work if the user did not have any control over the client_id and FedCM set it automatically. But we already have this mechanism with the Origin header.
Edit:
And when you choose a random client_id each time, it means you would need to keep a state for each single login request for each user and you can't verify any token like you would do it now. This means with each API request, you need to look up that random state / aud in some DB and make sure, that it actually belongs to you.
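To make the state-keeping cost concrete, here is a minimal sketch in Python of what an RP would have to do if every login used a fresh random client_id. The PendingLogins class and its method names are hypothetical, the store is an in-memory set rather than a real database, and the claims are assumed to come from a token whose signature has already been verified elsewhere.

```python
import secrets


class PendingLogins:
    """Hypothetical RP-side store of random client_ids awaiting a token."""

    def __init__(self) -> None:
        self._pending = set()

    def start(self) -> str:
        # A fresh, unguessable client_id for this single login attempt.
        client_id = secrets.token_urlsafe(16)
        self._pending.add(client_id)
        return client_id

    def finish(self, token_claims: dict) -> bool:
        """Accept the token only if its aud is a client_id we issued."""
        aud = token_claims.get("aud")
        if not isinstance(aud, str) or aud not in self._pending:
            return False
        # One-time use: a second token with the same aud is rejected.
        self._pending.discard(aud)
        return True
```

This only shows that the RP can match a token to a login it started. It does nothing to stop another origin from relaying a client_id it obtained from the RP, which is the concern raised above.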
And what would prevent evil.example.com from requesting a token for good.example.com, when the IdP can't verify the client_id / aud against an origin? You are even increasing the attack surface with this. You need to somehow validate that your token is actually being delivered to the correct client. This would only work if the user did not have any control over the client_id and FedCM set it automatically. But we already have this mechanism with the Origin header.
This is a valid point. You could use the client_id like a nonce (or use the built-in FedCM nonce), but you're still vulnerable to evil.example.com requesting a client_id from good.example.com and playing middle man. The best solution I have so far would be to use PKCE and do the full authorization code flow, so evil.example.com never has a chance to see the PKCE code verifier.
The problem is that now the IdP is receiving requests directly from RPs. This is still a big improvement, but an IdP could do IP correlation and likely determine the IPs of at least some RPs. So RPs would have to use VPNs/Tor/etc for those requests to preserve user privacy, which is definitely not ideal. Going to need to give this some more thought.
And when you choose a random client_id each time, it means you would need to keep a state for each single login request for each user and you can't verify any token like you would do it now. This means with each API request, you need to look up that random state / aud in some DB and make sure, that it actually belongs to you.
Not sure I understand what you mean here. Are you talking about the IdP side or the RP side? For the IdP side, I don't need access tokens or additional APIs beyond the ID token. LastLogin (and likely other privacy-focused IdPs) is essentially there to vouch that X user controlled Y ID (typically an email address) at time Z. Narrow scope increases security and reduces the need for trust. You can think of what I'm aiming for as conceptually very similar to Mozilla Persona.
If you're talking about the RP side, once the IdP has asserted identity, you're going to create your own session anyway based off that assertion.
Or am I misunderstanding you?
The best solution I have so far would be to use PKCE and do the full authorization code flow, so evil.example.com never has a chance to see the PKCE code verifier.
PKCE is useless, when you are not validating the origin or redirect uri and it would not even be a MITM, because evil.example.com could do the whole flow from start to finish and the IdP would not even notice, because the client is not confidential and therefore can't validate a client_secret. When evil then finished the whole flow and received a token that is valid for your client only, it can potentially use it to do API requests and even your API would not notice it, because the validation would be ok.
Not sure I understand what you mean here. Are you talking about the IdP side or the RP side? For the IdP side, I don't need access tokens or additional APIs beyond the ID token.
I mean the RP. It might be the case, that you in your case only care about the id_token, but what about all the other cases? Additionally, in that case you could only use the id_token once, directly when you receive it, and you would need to implement another mechanism to verify the validity of the token / session, because you would not be able to verify the token on subsequent requests.
If you're talking about the RP side, once the IdP has asserted identity, you're going to create your own session anyway based off that assertion.
That might be the case for you, but what if the client simply wants to use an access_token? And then, why even bother creating a JWT in the first place, when you are not validating it. Then a simple JSON response without any signature would do the trick as well and be a lot faster and more efficient at the same time.
I think the FedCM should be defined in a way that it can be used in a lot of scenarios, not just the most basic ones.
PKCE is useless, when you are not validating the origin or redirect uri and it would not even be a MITM, because evil.example.com could do the whole flow from start to finish and the IdP would not even notice, because the client is not confidential and therefore can't validate a client_secret. When evil then finished the whole flow and received a token that is valid for your client only, it can potentially use it to do API requests and even your API would not notice it, because the validation would be ok.
This is how PKCE helps:
evil.example.com never knows the PKCE verifier, so it can't use the code to retrieve the token. Only the RP backend can do that. It's true that evil.example.com could create a token for the same client ID, but what would it do with it? The RP isn't going to trust tokens coming from the frontend, only authorization codes. And if evil.example.com gives it an evil code, the PKCE verifier isn't going to match.
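For reference, the verifier/challenge relationship being relied on here can be sketched in Python using the S256 method from RFC 7636. The function names are hypothetical and this is not tied to any FedCM endpoint; it only shows why a party that has seen the challenge and the authorization code, but not the verifier, cannot redeem the code.

```python
import base64
import hashlib
import hmac
import secrets


def make_verifier() -> str:
    # 32 random bytes encode to a 43-character URL-safe string,
    # the minimum verifier length RFC 7636 allows.
    return secrets.token_urlsafe(32)


def challenge_for(verifier: str) -> str:
    # S256: BASE64URL(SHA256(verifier)) with the padding stripped.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def idp_accepts(stored_challenge: str, presented_verifier: str) -> bool:
    # The IdP recomputes the challenge from the presented verifier and
    # compares it, in constant time, with the one stored for the code.
    return hmac.compare_digest(stored_challenge, challenge_for(presented_verifier))
```

In this picture the RP backend keeps the verifier, the challenge travels through the frontend with the authorization request, and the verifier is only presented when the backend redeems the code.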
It might be the case, that you in your case only care about the id_token, but what about all the other cases?
That might be the case for you, but what if the client simply wants to use an access_token?
I think the FedCM should be defined in a way that it can be used in a lot of scenarios, not just the most basic ones.
Other cases are already covered by the current default FedCM design, which sends the Origin header. I'm not asking to remove this functionality, simply for a way to opt out of this for privacy-focused IdPs that simply want to provide identity, and not a bunch of other functionality.
And then, why even bother creating a JWT in the first place, when you are not validating it. Then a simple JSON response without any signature would do the trick as well
You are correct that once you switch to the 3 legged authorization code flow and if you don't need to use the tokens more than to assert identity in that moment, there's not much point in using JWTs.
evil.example.com never knows the PKCE verifier, so it can't use the code to retrieve the token. Only the RP backend can do that. It's true that evil.example.com could create a token for the same client ID, but what would it do with it? The RP isn't going to trust tokens coming from the frontend, only authorization codes. And if evil.example.com gives it an evil code, the PKCE verifier isn't going to match.
I know what PKCE does and how it works, that was not the point. I was saying it is useless, when you accept requests from any origin or if you simply don't care where they are coming from.
I get that in your case you only use the id_token once, directly after you received it from the IdP, and then throw the token away. In that case it's fine. But in all others it's not, because usually you request a token which you will then use further for protected API endpoints. And in these cases, it would be the worst if any site could request a token for your API. PKCE doesn't help at all in that scenario.
When you end up on https://gooogle.com, which you reached via a link from your email, and log in, it could request a token for https://google.com, and if you are not careful, you are screwed. If the IdP doesn't validate the origin, it would send an https://google.com token to https://gooogle.com and every part of the chain would be "happy" about it. The attacker probably the most, because he owns your account from that point on.
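The check that blocks this lookalike-domain case can be sketched as follows, in Python. This is a minimal illustration under assumptions: the REGISTERED_ORIGINS table, the client_id value, and the function name are hypothetical, and a real IdP would read the Origin header from its own HTTP framework.

```python
from typing import Optional

# Hypothetical registry mapping each client_id to the single origin
# that is allowed to request tokens for it.
REGISTERED_ORIGINS = {
    "google-client": "https://google.com",
}


def may_issue_token(client_id: str, origin_header: Optional[str]) -> bool:
    """Issue a token only when the request's Origin matches the registration."""
    expected = REGISTERED_ORIGINS.get(client_id)
    if expected is None or origin_header is None:
        # Unknown client or missing Origin header: refuse.
        return False
    # Scheme and host are case-insensitive, so compare lowercased strings.
    return origin_header.lower() == expected.lower()
```

With a registration like this, a request for the same client_id arriving with Origin https://gooogle.com is refused, which is exactly the validation that a random, unverifiable client_id gives up.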
I think adding something like this is dangerous if used incorrectly and it would be very easy to screw that up. JWTs, for instance, have design flaws which actually make them kind of insecure by design. You can mess up the validation very easily if you don't know what you are doing and I think such issues should be avoided as much as possible with new designs.
I think I might have given the impression that I'm confident I'm right about this. Definitely not the case! But I'm really hopeful that what I'm trying to do could be possible, with a reasonable set of tradeoffs. I've already learned a lot from our back and forth, so thank you for that, and for your patience.
I don't quite understand your google attack example above. Could you run through an example interaction for the id_token-only case, that shows at what point the attacker creates a session on the target app, and how they were able to get the app to trust the ID token?
Note that for anonymous login itself I don't consider leaking the ID token itself a security problem. These tokens would carry minimal data (most likely only an ID, such as an email address or better yet a public WebID). Users who care enough about privacy to use such a system would have to understand that any app can request a login from them at any time. Most importantly, the IdP UI would say something along the lines of "An anonymous app wants to log you in. This could expose your email address to an untrusted app. Only use a public email address for this type of login."
I get that there are big tradeoffs here, enough so that I'm not certain it would find much use in practice. But my instincts tell me that the value of privacy is high enough that some would be willing to pay it.
I think adding something like this is dangerous if used incorrectly and it would be very easy to screw that up
I agree you have to be careful what you add to systems, because everything will be used incorrectly at some point. But I think the risk in this case is relatively small. First, there's not much incentive for most IdPs to care about this. Generally implementers look to take the easiest path. So you could make it extra work to implement this, which should guard against accidental usage. Even having a hard-coded list of IdPs in browsers would be better than nothing.
I don't quite understand your google attack example above. Could you run through an example interaction for the id_token-only case, that shows at what point the attacker creates a session on the target app, and how they were able to get the app to trust the ID token?
When you are doing it exactly like in your case, where you fetch the id_token directly (important) from the IdP without any party in between, you then use it once and throw it away, you're fine with not being able to validate the token. TLS guarantees that you can get the token only from the IdP itself and that no one has been tampering with it.
The above scenario will become a problem as soon as you are doing anything else, like for instance fetch the token from the UI first and then forward it to your backend, or if you have endpoints that actually use a given token, no matter what type.
Note that for anonymous login itself I don't consider leaking the ID token itself a security problem.
It may not be a security problem, but I guess it's a much bigger privacy issue than having your IdP know the origin where you logged in? id_tokens by design carry personal user data, and most often it's not just minimal data. They may carry your full name, address, phone number, and so on.
Generally implementers look to take the easiest path.
That's exactly the problem, because the easiest path is usually not very secure. In OIDC for instance the easiest would be to use the implicit flow, which you really should never do these days.
Even having a hard-coded list of IdPs in browsers would be better than nothing.
I think this would make things a lot worse tbh, but this is another topic.
@sebadob I've thought about this some more and I think you're actually right that this is trivial to MITM:
And that's game. I think I also understand what you meant saying the PKCE code is useless. The problem isn't making sure the app.com backend can trust where it got the ID token. The problem is that it can't trust where it got the authorization code because the app.com frontend is easy to spoof. The only solution I know so far is (drumroll...) for the IdP to verify that the client it presents to the user matches the Origin header of the app that calls the ID assertion endpoint.
So up to this point I'd say I'm convinced this is a bad idea. Maybe there is some way to make this secure, but this discussion also raised some real UX issues. Asking users to log in to anonymous apps seems gross. There are also a lot of useful features you'd be throwing away the ability to use.
At the end of the day, I think @sebadob has the correct answer: choose an IdP you trust. FedCM is on track to enable users to have whatever IdP they want, so this is possible. Taken to its logical conclusion, users can self-host their own IdP, which should provide everything I need without sacrificing features.
Thanks everyone for your time on this and especially @sebadob.
A stated primary goal of FedCM is to preserve privacy. However, as currently designed it exposes to the IdP a list of every RP the user logs in to. This information is primarily leaked by sending the client_id parameter, which is expected to be associated with an RP, to the IdP.
This information shouldn't be necessary for an identity service. A notable example is Mozilla Persona, which allowed users to have their IdP attest their identity (in that case defined as control over an email address) to any app without the IdP knowing what apps were using that information.
It would be possible for a FedCM IdP to accomplish something similar with the current defined endpoints, but would require violating the spec. It could look something like this:
In the navigator.credentials.get call, the RP uses a random value for the client_id instead of a value that can be associated with the RP.
client_id and returns fake URLs for privacy policy and TOS. Note that there is a discussion about changing the way these endpoints are provided anyway. See https://github.com/fedidcg/FedCM/issues/581#issuecomment-2123068427
On the ID assertion endpoint, the IdP may use the random client_id in its assertion process. For example, it might place the client_id in the aud property of an OIDC ID token. This allows the RP to verify that the token is intended for it.
Is there any chance of closing this hole, either by changing the wording of the spec to remove the requirement for the IdP to be able to identify the RP, or (even better) by changing the endpoints to not even use client_id?