WICG / dbsc

Require request signing for proof-of-possession #23

Open joaopenteado opened 3 months ago

joaopenteado commented 3 months ago

Hello!

If my understanding is correct, the temporary auth_cookie provided by the server is a short-lived opaque string that is used by the server to keep track of the authenticated user. The idea here seems to be to improve only the security of the process of refreshing an access token by making use of a device bound cryptographic key, initially registered when establishing a session.

This seems to me like a missed opportunity to do something truly great for the safety of the web by improving the overall security of ALL authenticated requests. A few months ago, OAuth 2.0 Demonstrating Proof-of-Possession (DPoP) was standardized to solve more or less the same issue: stolen or intercepted tokens.

Unfortunately, as it stands:

It seems like we are very close to where we want to be, but still not quite there yet.

In addition, the current proposal does not address concerns regarding replay attacks. I understand this could be "out-of-scope" and there might also be some performance implications, but I think it's important to have this discussion. Take the incident that happened with Okta late last year as an example, when their CS platform was hijacked and a bunch of session tokens provided by admins in tickets during troubleshooting were stolen by the attackers. The current DBSC proposal would certainly have made a big difference in the impact (since most session cookies would have expired), but still not enough to cover everything (i.e. depending on how fresh the provided request was and/or how long the server set the refresh time).

joaopenteado commented 3 months ago

I can also imagine an implementation of request signing that is opt-in via some sort of flag/parameter on the final Set-Cookie response, sent along with the auth_cookie. If such a flag is set, then the browser would also provide a signed JWT with every request, similar to the DPoP proof JWT defined in the OAuth DPoP spec.

Also relevant to the discussion: RFC 9421: HTTP Message Signatures - standardized 2 months ago.

wparad commented 3 months ago

I think the easiest and most forward-compatible approach would be for the browser to just automatically create the DPoP signature and send it along with the cookies on every request. The Resource Server can decide what to do at that moment.

For instance the flow could be:

The most optimal approach would probably be to have signed cookies, signed by the service side, which include a hash of the Browser DPoP public key. Then, when the cookie is returned, the service can verify the cookie signature and also validate that the hash of the public key matches the one in the cookie.
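As an illustration of the signed-cookie idea above, here is a minimal Python sketch. It is not part of DBSC or DPoP: the names `SERVER_SECRET`, `issue_cookie` and `verify_cookie` are hypothetical, and an HMAC stands in for whatever signature the service would actually use. The sketch only shows the binding check; the browser would still need to prove possession of the matching private key separately.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical server-side secret; a real deployment would load this from a key store.
SERVER_SECRET = b"example-server-secret"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def issue_cookie(session_id: str, browser_public_key: bytes) -> str:
    """Return a cookie value whose payload embeds a hash of the browser's public key."""
    payload = json.dumps(
        {"sid": session_id, "key_hash": hashlib.sha256(browser_public_key).hexdigest()},
        sort_keys=True,
    ).encode("utf-8")
    tag = hmac.new(SERVER_SECRET, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(tag)


def verify_cookie(cookie: str, presented_public_key: bytes) -> bool:
    """Check the server's signature, then compare the embedded key hash to the presented key."""
    try:
        payload_b64, tag_b64 = cookie.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:
        # Wrong number of parts or invalid base64.
        return False
    expected = hmac.new(SERVER_SECRET, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    claims = json.loads(payload)
    presented_hash = hashlib.sha256(presented_public_key).hexdigest()
    return hmac.compare_digest(claims["key_hash"], presented_hash)
```

A cookie issued for one key fails verification when presented alongside a different key, which is the property the comment above relies on.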

arnar commented 2 months ago

Proof-of-possession on every request and future integration with things like RFC 9421 are very much on our mind when proposing this API. But we also want something deployable now.

To have any value against malware, the private key needs some better protection than cookies, and on current and legacy systems this means special system APIs and even hardware that requires serialized (as in single-threaded) access. Those are currently too slow for signing each individual request. There are two reasons this first iteration of DBSC has the refresh mechanism and short term cookies:

  1. The browser cannot generate signatures over every request from protected private keys, simply because it is too slow.
  2. All auth stacks/platforms/frameworks/etc server side already run on cookies, and switching that to signature verification of e.g. a DPoP header is a major multi-layered migration. The refresh pattern here allows for deploying at least that without total rewrites of e.g. legacy apps; they still only deal with the presence and validation of a cookie.

Both of those are problems that we expect to go away over time (but only if there are use cases, which for the web won't happen without APIs like this first). When we can have signatures per request, I still believe an API that formalizes the session concept to the browser is necessary, i.e. where there's an explicit "start session" signal from the website. That's where key registration happens, and allows the server to associate the public key with its internal notion of the session. This was one of the things that was missing from TokenBinding, where keys are created and registered implicitly.

I can imagine two future stages to how we get there with DBSC:

  1. First, and this might be feasible on some platforms today, on refresh where the private key p-o-p is presented, instead of setting a short term cookie and instructing the browser to call us back when it expires, we embed a DH exchange for a symmetric key in the challenge/response protocol, and the symmetric key has similar malware-resistance as the private key. The instruction to the browser is then to individually sign (e.g. with an HMAC) each HTTP request for some period of time.

  2. When we can actually sign each request with a protected private key, the DBSC session initiation still happens to register the key, but the only instruction it gives is to sign in-scope requests, e.g. with RFC 9421, from that point on until the session ends. (This can even be integrated with the transport layer, although I think there are other problems around that, e.g. around maintaining multiple sessions, etc.)
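The first stage above (a symmetric key used to HMAC each request) could look roughly like the following Python sketch. Everything here is an assumption for illustration: the canonical request form, the function names and the `max_skew` window are hypothetical, and the DH exchange that would produce `session_key` is not shown.

```python
import hashlib
import hmac


def sign_request(session_key: bytes, method: str, path: str, body: bytes, timestamp: int) -> str:
    """Return a hex HMAC tag over a simple canonical form of the request."""
    body_hash = hashlib.sha256(body).hexdigest()
    message = "\n".join([method.upper(), path, str(timestamp), body_hash]).encode("utf-8")
    return hmac.new(session_key, message, hashlib.sha256).hexdigest()


def verify_request(
    session_key: bytes,
    method: str,
    path: str,
    body: bytes,
    timestamp: int,
    tag: str,
    now: int,
    max_skew: int = 60,
) -> bool:
    """Recompute the tag server-side and reject requests outside the allowed time window."""
    if abs(now - timestamp) > max_skew:
        return False
    expected = sign_request(session_key, method, path, body, timestamp)
    return hmac.compare_digest(expected.encode("ascii"), tag.encode("utf-8"))
```

Because the tag covers the method, path, body hash and timestamp, a captured tag cannot be reused for a different request or much later in time.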

(side note: caching DPoP assertions or similar and reusing them across requests is functionally equivalent to the short-term cookie, in terms of security properties. It is easy to imagine a future DBSC instruction from the refresh endpoint that instead of saying "maintain this cookie" it says "attach this header to every request for the next X minutes")

We did explore alignment with DPoP, including consultations with the folks central to that effort. At the time, DPoP's main goal was integration with OAuth, specifically for the scenario where you have an authorization server and a resource server with an asymmetric trust relationship. This doesn't really apply for most web sessions, which are only between a browser and a single website. Therefore DPoP was at that time focused on binding refresh tokens, as the token exchange only involves the AS, and there was a more complicated story around protecting access tokens.

At the end of the day, the simple part is how to handle key pairs and sign stuff. The hard part is designing a protocol that fits existing flows/contexts (like OAuth vs. web sessions in this case), and having something that can be deployed at scale without major rewrites.

On replay: This is glossed over in some of the sequence diagrams in the explainer, but DBSC does have a notion of a server-issued challenge. Whether a challenge is required is the server's choice. Getting a challenge can either be a separate round-trip during refreshes, when that applies, or challenges can be preemptively issued via regular HTTP response headers on any given request.
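To make the challenge idea concrete, here is a small Python sketch of server-side challenge bookkeeping. It is not taken from the explainer: `ChallengeStore`, its in-memory dictionary and the TTL are hypothetical, and verifying the browser's signature over the challenge is out of scope here.

```python
import secrets


class ChallengeStore:
    """In-memory store of outstanding challenges, keyed by session id (illustrative only)."""

    def __init__(self, ttl_seconds: int = 300):
        self.ttl_seconds = ttl_seconds
        self._pending = {}  # session_id -> (challenge, issued_at)

    def issue(self, session_id: str, now: float) -> str:
        """Create an unpredictable challenge, e.g. to send in a response header."""
        challenge = secrets.token_urlsafe(32)
        self._pending[session_id] = (challenge, now)
        return challenge

    def consume(self, session_id: str, challenge: str, now: float) -> bool:
        """Accept a challenge at most once, and only while it is still fresh."""
        # pop() removes the entry on any attempt, so a challenge cannot be replayed.
        entry = self._pending.pop(session_id, None)
        if entry is None:
            return False
        expected, issued_at = entry
        if now - issued_at > self.ttl_seconds:
            return False
        return secrets.compare_digest(expected, challenge)
```

Issuing the challenge ahead of time on an ordinary response, as described above, would avoid the extra round-trip at refresh time.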

joaopenteado commented 2 months ago

Proof-of-possession on every request and future integration with things like RFC 9421 are very much on our mind when proposing this API. But we also want something deployable now.

Awesome! It's nice to see that we share more or less the same vision regarding the desirable state of web authentication, especially with respect to the 2nd stage you described. Thanks for sharing the previous discussions regarding DPoP and the insight regarding the replay attack prevention.

I think prioritising smaller incremental security gains is a very reasonable approach. I can certainly see the security benefits brought to the table with the proposed standard "as is" and I also agree that adoption by service providers will be far simpler and easier by just requiring changes to the initial login flow and refresh token renewal instead of reworking the entire authentication layer.

I'm also aware of the current hardware limitations on some platforms that prevent signing on every request, but I still think even now the performance penalty tradeoff could be worth it in specific sensitive scenarios (e.g. online banking, privileged sessions on cloud platforms or enterprise tools and so on). Players that operate in such high-stakes environments are more likely to adopt and take advantage of new security standards faster than your average service provider. As I understand, some vendors already implement something akin to this on their own mobile app clients.

That is to say, I think that despite its limitations, it could be worth investing in a beta / optional flag on top of the current standard to support these applications (and also to ease future migration) in the short term too. Even with a lot of interest from these parties, adoption of such a change will likely take some time.

wparad commented 2 months ago

I think the key here is this:

(side note: caching DPoP assertions or similar and reusing them across requests is functionally equivalent to the short-term cookie, in terms of security properties. It is easy to imagine a future DBSC instruction from the refresh endpoint that instead of saying "maintain this cookie" it says "attach this header to every request for the next X minutes")

Maintaining the cookie is subject to attack, but if the header could be completely controlled by the browser and neither:

Arguably, if we jump directly to adding a header automatically to every request, then adoption becomes easier, not harder, since the server doesn't even need to generate the cookie in a different way.

That's so much better than the current DBSC proposal which also requires:

That's a lot of wasted code propagation in both browsers (not the client side app) AND on the service side to deal with this new cookie. Realistically, if we can jump to already including a header, DPoP compatible or not, it would make it so much easier for everyone to integrate with, and we wouldn't end up with legacy implementations and cookies using a now unnecessary auth-cookie attribute that is by consensus legacy technology.

This also incredibly simplifies the flow:

That's incredibly simpler for everyone and will guarantee quick adoption.

arnar commented 2 months ago

Thanks @wparad. I don't fully understand if you are talking about attaching a static header value until the next refresh, or if you are talking about attaching a request-signing header that needs to be computed for each request. Since we already discussed the latter being infeasible for the time being, I'll assume the former. Apologies if I misunderstood.

Maintaining the cookie is subject to attack, but if the header could be completely controlled by the browser and neither:

  • accessible to the site (if the site is vulnerable to attack)
  • have to be managed by the service side

A cookie can (and should) be made unavailable to the site with standard methods, such as setting it as HttpOnly and Secure. A header value that is static between refreshes still has to reference something that the server can understand and validate as belonging to this session, and I can't see a simpler way to do that than for the refresh endpoint server to return the value it wants to use. So I don't think making it a dedicated header rather than a cookie relieves the server of any management.

Minor change on service side to validate header when it appears

Validating a new header seems at least as much work as adding validation logic for a new cookie name. It's been our experience that it's actually more work and plumbing as cookie handling is pretty standard and supported in existing frameworks. Of course, that may be specific to our backend setups.

Note also that the short-term cookie used by DBSC doesn't have to be a new cookie. You could deploy DBSC by making your existing long-lived session cookie take the role of the short-term cookie maintained by DBSC. That way you'd need zero changes to the auth stack of existing endpoints.

wparad commented 2 months ago

I meant the former "a static header that would be recomputed on the order of minutes or realistically on the same order as the current DBSC proposal's suggested POST /securesession/refresh call frequency". I totally understand that in the future we want the TPM or at least a TPM-derived key to be used to sign every request, but I believe we can leave the dynamic cryptographic functionality of the data in the header to a later point in time. We want this, but know it isn't possible today, so as long as in the future we can easily hot swap in a parameter requestSigning: true and get new headers that contain the right information, that aligns with what I believe everyone wants.

And with that there would never need to be a POST /securesession/refresh, because the browser would handle the "refresh" automatically by generating a new signature however frequently requested by the configuration passed to the navigator.securesession.create method.

To the rest of the comment, I believe the original proposal of DBSC is unclear here and could use some improvements with an actual expected implementation/use case. I feel like what you are saying here is that the website tells the browser "the auth server will stick the access_token in a cookie named COOKIE_TOKEN_123; if that cookie does not exist, send all the cookies, which presumably contain the session_token_identifier, to the service endpoint to get a new COOKIE_TOKEN_123 access_token cookie". If that is true, then I think I understand a bit more about this and want to potentially start off with the question of "what does the long term implementation look like here", which I elaborated on in: https://github.com/WICG/dbsc/issues/36#issuecomment-2044692402

jackevans43 commented 2 months ago

Is it necessary to sign every request? While malware is present on a user device, it can also do this. Aren't we trying to reduce the time between malware being removed and an attacker losing the ability to pretend to be the user?

If a new token (cookie, header etc) is generated, say, every day then we've reduced the attacker's access from forever to up to one day. If we sign every request it'll be reduced further to zero - but given the compromises it'll involve (e.g. probably can't use a TPM, therefore increasing the risk that malware could steal the secret key) it might actually increase the attacker's access after malware removal back to forever...

wparad commented 2 months ago

I'm not sure why we should be stuck on malware on the device. Instead it makes sense to break this down into two levels:

This proposal is meant to solve the first one. If malware is completely affecting the whole device then there is never any way to prevent an attack. If the malware already has access to that, then even removing the malware doesn't reduce the attack, since the attack could have already been carried out. If you are specifically concerned with an APT on the user's devices that has long-lived malicious behavior, that is going to require a whole different solution, which is arguably, and I agree, outside of this proposal.

Let's talk real world use cases. The user has a malicious extension installed which can gain access to the browser's cookies. Once the malware is removed, it will no longer have access in real time, and the stolen cookies are useless.

Let's look at another one: a malicious JavaScript package. Cookies that aren't marked as HttpOnly are susceptible to cookie exfiltration for previous users. Upon upgrading the site to use a new package, the attacker would no longer be able to make requests on the user's behalf. Again, the cookies it has stolen are worthless.

As long as the TPM is generating new JWTs every 5 minutes and the malware never has access to the private key of the TPM, even on device malware won't matter, and again, we don't have to worry about it. The scheme above with DPoP would still work, right?

arnar commented 2 months ago

This proposal is meant to solve the first one

If by "this proposal" you mean DBSC, then that's incorrect. DBSC is meant to address malware that has the same privileges as the browser itself, which necessarily includes full access to sign arbitrary data with the session keys. The real world cases we want to address are native info-stealer malware, not JS compromises within the web context.

If malware is completely affecting the whole device then there is never any way to prevent an attack. If the malware already has access to that, then even removing the malware doesn't reduce the attack, since the attack could have already been carried out.

The point is to prevent malware from exfiltrating sessions. Quoting our blog post:

By binding authentication sessions to the device, DBSC aims to disrupt the cookie theft industry since exfiltrating these cookies will no longer have any value. We think this will substantially reduce the success rate of cookie theft malware. Attackers would be forced to act locally on the device, which makes on-device detection and cleanup more effective, both for anti-virus software as well as for enterprise managed devices.

wparad commented 2 months ago

I was using JS compromises as an easy to understand example, because it simply demonstrates to readers what an attack looks like. Most JS compromises are indistinguishable from Browser or OS compromises for many scenarios, but of course the problems we are trying to solve here can't be limited to JS compromises.

As I shared in https://github.com/WICG/dbsc/issues/46#issuecomment-2049140588

As long as there is malware on the device there is no way to prevent it from impersonating the browser or stealing all the information that the browser has. I'm calling that the corner case that can't really be solved, or rather it can only be solved if the device supported a way to securely identify the process and executable that is calling the TPM in a unique way so that malware wouldn't be able to call it AND it would also require that every request sent to the Resource Server was signed by the TPM.

That is what I am calling the corner case. Because as soon as the malware is removed, it will no longer have access to the TPM. Which means all we need to do is the same thing we do in every other JWT creation situation:

  • Leeway - Pass a required issued-at property, which the server verifies was generated by a clock within the last 10 seconds.
  • Expiry - Pass a required expiry property, with an expiry within the next 5 minutes.

Of course both of those should be configurable by the site requesting the TPM and not by the service.
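A literal reading of the leeway and expiry checks could be sketched in Python as below. The function name and defaults are hypothetical, the claim names `iat` and `exp` follow common JWT usage, and signature verification is assumed to have happened already. The comment above leaves open at which point each check applies, so this sketch simply applies both against the server's current clock.

```python
def validate_time_claims(
    claims: dict,
    now: int,
    iat_leeway: int = 10,
    max_lifetime: int = 300,
) -> bool:
    """Check issued-at and expiry claims (Unix seconds) against the server clock."""
    iat = claims.get("iat")
    exp = claims.get("exp")
    if not isinstance(iat, int) or not isinstance(exp, int):
        return False
    # Leeway: the token must have been issued within the last `iat_leeway` seconds.
    if not (now - iat_leeway <= iat <= now):
        return False
    # Expiry: the token must not be expired and must not outlive `max_lifetime`.
    return now < exp <= now + max_lifetime
```

Note that both values come from the client's clock, so these checks bound how long a signed token stays usable but do not by themselves show when the signature was produced.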

Anything (such as a nonce) generated by the service doesn't help us here because if the Malware is on the device then it also has access to everything generated by the TPM, so even if the TPM refused to sign a duplicate request, the Malware could just steal the existing request and inject in a fake result back into the browser. And if we really wanted a nonce, then the session ID already is that nonce, I'm suggesting that asking the server to continually generate a nonce is the unnecessary extra amount of work because it doesn't help deal with the threat model proposed.

It's absolutely true that DBSC forces the attacker to operate locally on the User Device, but so does every other strategy that attempts to utilize a TPM in literally any way, as long as these two invariants are enforced: Leeway && Expiry. So what's important isn't unnecessary passing back and forth between the service side and the user device, but rather the client side signing any time-bound data, irrespective of whether that time binding was generated by the Service, the User Device, or the TPM. That's because in the case of the latter two, we can easily suggest validation strategies, the same exact strategies that exist for the validation of all JWT access tokens, which would make these extra calls to the Service side unnecessary, wouldn't it?

Or is there an attack you are thinking about that validating the leeway and expiry bound in the token would not be sufficient?

jackevans43 commented 2 months ago

To prove current possession of the private key, you'll need to sign some piece of data from an external source that you couldn't have predicted. Without this, malware previously on the device could have generated tokens that would be valid in the future and sent to an attacker for later use.

wparad commented 2 months ago

Correct, but in this case, the argument is that you already have something that cannot be predicted: the value of the Session Credential, which often is itself already a nonce. The requirement to make a second call to get a second nonce is what I'm failing to articulate eloquently as unnecessary.

Sora2455 commented 2 months ago

Consider the following scenario:

  1. Victim starts a session on DBSC-enabled site
  2. Victim's computer is infected with malware, which steals their cookies
  3. Malware is removed from victim's computer
  4. Attacker uses stolen cookies, which are already signed

That implementation of DBSC provides no protection.

As above, but the server requires that the signed data includes the current, say, hour:

  1. Victim starts a session on DBSC-enabled site
  2. Victim's computer is infected with malware, which steals their cookies
  3. Attacker recognises signing strategy, and signs 100 cookies for the next 100 hours
  4. Malware is removed from victim's computer
  5. Attacker uses stolen cookies, of which they have valid cookies for the next 100 hours

This provides almost no protection. It'll catch out attackers who are unfamiliar with this site's signing scheme, but given the popularity of high-value targets (e.g. social media) and the nature of malware-as-a-service, their malware toolkit will likely pre-generate more than enough cookies for them to carry out whatever attack they have in mind.

As above, but the server requires that the signed data includes a cryptographically random challenge generated every hour:

  1. Victim starts a session on DBSC-enabled site
  2. Victim's computer is infected with malware, which steals their cookies
  3. Attacker recognises signing strategy, but cannot sign cookies in advance as future challenges have not been generated yet
  4. Malware is removed from victim's computer
  5. Attacker uses stolen cookies until challenge expires, at which point stolen cookies are useless

That is the protection that DBSC is aiming to achieve. If the signed data doesn't include a truly random value that changes regularly, then the attacker can either continue using the cookies they stole without needing to change them, or they can pre-generate cookies which will be valid in the future while they still have access to the victim's computer.
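The difference between the second and third scenarios can be shown with a short Python sketch. It is only a model: an HMAC with a shared `DEVICE_KEY` stands in for the device-bound asymmetric key, and `hour_label`, `sign` and `server_accepts` are hypothetical helpers.

```python
import hashlib
import hmac
import secrets

# Stand-in for the device-bound key. HMAC keeps the sketch dependency-free;
# DBSC itself uses an asymmetric key pair, with the server holding only the public key.
DEVICE_KEY = b"device-bound-key"


def sign(data: bytes) -> bytes:
    return hmac.new(DEVICE_KEY, data, hashlib.sha256).digest()


def server_accepts(expected_data: bytes, signature: bytes) -> bool:
    return hmac.compare_digest(sign(expected_data), signature)


def hour_label(hour: int) -> bytes:
    return f"hour:{hour}".encode("ascii")


# While malware is on the device, it signs the predictable data for the next 100 hours.
current_hour = 1000
pregenerated = {h: sign(hour_label(h)) for h in range(current_hour, current_hour + 100)}

# Scenario 2: after cleanup, the server expects a signature over the current hour.
future_hour = current_hour + 42
scenario2_attack_succeeds = server_accepts(hour_label(future_hour), pregenerated[future_hour])

# Scenario 3: after cleanup, the server expects a signature over a fresh random challenge.
future_challenge = secrets.token_bytes(32)
scenario3_attack_succeeds = any(
    server_accepts(future_challenge, sig) for sig in pregenerated.values()
)
```

In this model the pre-generated signatures pass the hour-based check but none of them matches a challenge that did not exist while the malware was present.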

wparad commented 2 months ago

You have well laid out the attacks, nice job on that.

However there is a flaw in the understanding here. This statement from "Scenario 2" is incorrect:

Attacker recognises signing strategy, and signs 100 cookies for the next 100 hours

The Attacker can sign cookies for the next 100 hours, but since the session credential will be rotated in ~1 hour, the signatures valid after that are worthless, because the signature should contain the hash of the session credential.

And this statement from "Scenario 3" isn't entirely correct:

Attacker uses stolen cookies until challenge expires, at which point stolen cookies are useless

And it is the nuance that the "Scenario 2" strategy actually solves for. In the current draft of the proposal the Attacker can use the access token cookies forever! Only the stolen session credential and associated signature have limited usage, until the signature or session credential expires. If the Access Token generated from the Session Credential has a life of 100 hours, then the attacker will have access for the next 100 hours. Session Credentials are being used to generate access tokens.

The reason the DBSC draft has gone for this, to my understanding, is the difficulty of providing a signature on every request under the current strategy, because the TPM is not being used to sign the hash of the Session Credential; the credential is instead just sent along with the request. Since we actually want to have a signature on every request, this would require calling the TPM once every ~hour, every time the Session Credential changes, and then passing this signature in every request. And if we are doing that, then the same strategy will prevent the vulnerability specified in "Scenario 2".

So I 100% agree that there is a vulnerability with "Scenario 2" if the hash(Session Credential) is not signed, but if it is, then the same vulnerability you are pointing out there exists with "Scenario 3", and if we fix that, then "Scenario 2" is also fixed. Which means there's no benefit to requiring the additional complexity on the service side which "Scenario 3" would require.

I think I explained that right, but it's possible there is still a mistake in my reasoning somewhere (it's hard to think about this without an actual implementation)

arnar commented 2 months ago

(with the caveat that I might be misunderstanding what you call "the session credential")

In the current draft of the proposal the Attacker can use the access token cookies forever!

Does "the proposal" refer to the DBSC as described in the explainer, or one of the proposals in this issue and/or #46?

For DBSC as described in the explainer, then there are only two credentials (things that give access), and all other values are identifiers that are needed to run the protocols but don't themselves give access. The credentials are:

  1. The private key, which is the device-bound thing. (i.e. it has some protection against exfiltration malware that has browser-level privileges on the system). This is meant to live as long as the session does, which could be forever.

  2. The short term cookies issued by the refresh endpoint. Getting one requires a fresh proof-of-possession of the private key, and these cookies are meant to live on the order of minutes to hours, depending on the need.

The cookies are bearer credentials, but have the short lifetime baked in by the issuing server, in exactly whatever manner that is done on cookies today (e.g. they are encrypted structures containing a server issued timestamp, TTL, and/or expiration time). An attacker that steals those cookies can only use them for those few minutes until they expire. After that the server won't accept them as credentials.
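As a rough illustration of a lifetime baked into the cookie by the issuing server, consider the Python sketch below. The comment above mentions encrypted structures; for brevity this sketch only authenticates the payload with an HMAC and does not encrypt it, and `COOKIE_KEY`, `mint_short_cookie` and `cookie_is_valid` are hypothetical names.

```python
import base64
import hashlib
import hmac
import json

# Hypothetical server-side key used to authenticate cookie contents.
COOKIE_KEY = b"example-cookie-key"


def mint_short_cookie(session_id: str, issued_at: int, ttl_seconds: int) -> str:
    """Issue a bearer cookie carrying a server-issued timestamp and TTL."""
    payload = json.dumps({"sid": session_id, "iat": issued_at, "ttl": ttl_seconds}).encode("utf-8")
    tag = hmac.new(COOKIE_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode("ascii") + "." + tag


def cookie_is_valid(cookie: str, now: int) -> bool:
    """Accept the cookie only if it is authentic and still inside its lifetime."""
    payload_b64, _, tag = cookie.partition(".")
    try:
        payload = base64.urlsafe_b64decode(payload_b64)
    except ValueError:
        return False
    expected = hmac.new(COOKIE_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode("ascii"), tag.encode("utf-8")):
        return False
    claims = json.loads(payload)
    return claims["iat"] <= now < claims["iat"] + claims["ttl"]
```

A stolen copy of such a cookie works until `iat + ttl` and is rejected afterwards, which matches the "few minutes until they expire" behaviour described above.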

(Sorry if I'm stating the obvious. I do have the feeling that the ideas in the explainer aren't coming across, which is probably our fault.)

wparad commented 2 months ago

Does "the proposal" refer to the DBSC as described in the explainer, or one of the proposals in this issue and/or https://github.com/WICG/dbsc/issues/46?

Current DBSC proposal.

It sounds like there is an assumption that DBSC will be the session credential. But in reality this isn't how many systems work; they already have a session management solution. Right now the existing state of the world is that sites have two things:

With the introduction of DBSC we have a third thing, which is the Private Key TPM credential, which signs the Session ID.

I hope that makes it clearer what I was referencing.

Sora2455 commented 2 months ago

Okay, in my case I was assuming the most basic of set-ups, the session credential.