CSRF token ideas - GitHub issues

Alcaro commented 4 years ago

> Future work: I haven't implemented this change yet since I'm still looking for an elegant way to generate, use and clean up CSRF tokens.

The most elegant cleanup procedure is the one that doesn't exist.

You already have the session ID. If it meets every requirement for a CSRF token except that the token must not be the session ID itself, use the sha256 of the session ID, possibly combined with a constant string that is hardcoded server-side (edit) and unknown to the attacker, so attackers can't brute-force the session ID from the token.

If you want your CSRF tokens to automatically expire (you most likely do), add a timestamp to the sha256 input. Use the unencrypted timestamp plus the sha256 result as your token.

namibj commented 4 years ago

@Alcaro This is called HMAC. It's a primitive that takes data and a key, and generates a constant-size output as a keyed one-way function. Don't manually concatenate input to hashes if you don't understand length-extension attacks.

Alcaro commented 4 years ago

Good point. While I believe the attacker has too little control over any input to perform such an attack (the constant string only exists server-side, which I was unclear about above; the session is server-issued; the timestamp must be numeric and in a narrow range), using well-known crypto mechanisms is still a wise move. I forgot HMAC is a thing.

(Though it's not completely clear to me what would be key and what would be text, given that there are three inputs - the constant key, session ID, and timestamp. Append server-side key and session for use as key? Use two of them as HMAC input, then throw the resulting hash into another HMAC, along with the third value?)

namibj commented 4 years ago

> (Though it's not completely clear to me what would be key and what would be text, given that there are three inputs - the constant key, session ID, and timestamp. Append server-side key and session for use as key? Use two of them as HMAC input, then throw the resulting hash into another HMAC, along with the third value?)

You'd concatenate the timestamp in a fixed-length encoding with the session ID, and use a secret server-side key as the HMAC key. You'd want to rotate that key, though that shouldn't be particularly difficult.
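
A minimal sketch of that construction, assuming Python's standard `hmac` module. The key value, the one-hour lifetime, the `<timestamp>.<mac>` token layout and the function names are all hypothetical choices for illustration, not something specified in this thread:

```python
import hashlib
import hmac
import struct
import time
from typing import Optional

# Hypothetical values: a real key would live in server-side configuration
# and be rotated periodically.
SERVER_KEY = b"replace-with-a-random-server-side-secret"
MAX_AGE_SECONDS = 3600


def _mac(ts: int, session_id: str, key: bytes) -> str:
    # The timestamp is packed as a fixed-length 8-byte big-endian integer,
    # so the boundary between timestamp and session ID is unambiguous.
    message = struct.pack(">Q", ts) + session_id.encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def make_csrf_token(session_id: str, key: bytes = SERVER_KEY,
                    now: Optional[int] = None) -> str:
    """Return '<timestamp>.<hex mac>', bound to the given session ID."""
    ts = int(time.time()) if now is None else now
    return f"{ts}.{_mac(ts, session_id, key)}"


def verify_csrf_token(token: str, session_id: str, key: bytes = SERVER_KEY,
                      now: Optional[int] = None,
                      max_age: int = MAX_AGE_SECONDS) -> bool:
    """Check the MAC and reject tokens that are expired or from the future."""
    try:
        ts_text, mac = token.split(".", 1)
        ts = int(ts_text)
    except ValueError:
        return False
    current = int(time.time()) if now is None else now
    if ts < 0 or ts > current or current - ts > max_age:
        return False
    expected = _mac(ts, session_id, key)
    # Constant-time comparison; bytes are used so non-ASCII input cannot raise.
    return hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8"))
```

Nothing is stored per token, so there is nothing to clean up. During a key rotation, one option would be to try verification with the current key first and the previous key second.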

fpereiro commented 4 years ago

Hi @Alcaro and @namibj! Thanks so much for your input. I'll be working on this problem in the upcoming week and will post updates soon.

fpereiro commented 4 years ago

Hi again! Just wrote down a proposed approach. I posted it as a question on HN here: https://news.ycombinator.com/item?id=22268152. Below is my long-winded original posting (which I edited down on HN because of the 2k character limit). Hopefully I'll get some extra feedback and come up with a definitive approach.

Hi there! Here's a question for those with experience writing and/or auditing the auth flows of a web application.

I recently decided to make my session cookies HttpOnly, which means that client-side JavaScript can no longer read them. This mitigates the damage of any XSS the app might suffer.

However, for CSRF prevention, I was sending said session cookie as an extra field with every POST request (this is called the double submit cookie pattern: https://medium.com/cross-site-request-forgery-csrf/double-submit-cookie-pattern-65bb71d80d9f). Now that the session is not accessible from javascript, I need to create a different token/secret that will function as a CSRF token.

Based on great feedback from the community (see https://news.ycombinator.com/item?id=22209588 and https://github.com/fpereiro/backendlore/issues/12), I'm considering the following approach:

1) On every successful login, create a new secret/token (using the same crypto mechanism I use to create the session secret, but a different secret altogether) and store it in the database, tied to the session itself. Set both the session and the CSRF token to expire at the same time.
2) Every time I get a request with a valid session, renew the life of both the session AND the associated CSRF token.
3) On every successful login, return the CSRF token in the body so that it can be read by client-side JavaScript.
4) Set up an endpoint where the client can retrieve the associated CSRF token for its session. If no session is present (or the session has expired), return a 403 code. This also solves the problem of letting the client-side app know whether the user is logged in or not (I would hit this GET /csrf endpoint when the JavaScript loads to determine whether there's a valid session available).
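
The steps above can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: an in-memory dict stands in for the database, the one-hour TTL and all function names are hypothetical, and password checking and cookie handling are left out:

```python
import secrets
import time
from typing import Dict, Optional, Tuple

SESSION_TTL = 3600  # seconds; hypothetical value

# Stand-in for the database: session ID -> {"csrf": token, "expires": unix time}.
# One shared expiry field makes the session and its CSRF token expire together.
sessions: Dict[str, dict] = {}


def login(now: Optional[float] = None) -> Tuple[str, str]:
    """Call after credentials were verified. Returns (session_id, csrf_token).

    The session ID would go into an HttpOnly cookie; the CSRF token would be
    returned in the response body so client-side code can read it.
    """
    now = time.time() if now is None else now
    session_id = secrets.token_urlsafe(32)
    csrf_token = secrets.token_urlsafe(32)  # same mechanism, different secret
    sessions[session_id] = {"csrf": csrf_token, "expires": now + SESSION_TTL}
    return session_id, csrf_token


def touch(session_id: Optional[str], now: Optional[float] = None) -> Optional[dict]:
    """Return the live session record and renew its lifetime, or None."""
    now = time.time() if now is None else now
    record = sessions.get(session_id)
    if record is None or record["expires"] <= now:
        sessions.pop(session_id, None)  # drop expired sessions lazily
        return None
    record["expires"] = now + SESSION_TTL  # renews session and CSRF token
    return record


def get_csrf(session_id: Optional[str], now: Optional[float] = None) -> Tuple[int, Optional[str]]:
    """Model of GET /csrf: returns (status code, body)."""
    record = touch(session_id, now)
    if record is None:
        return 403, None
    return 200, record["csrf"]


def check_post(session_id: Optional[str], submitted_token: str,
               now: Optional[float] = None) -> bool:
    """Accept a POST only if the submitted token matches the stored one."""
    record = touch(session_id, now)
    if record is None:
        return False
    return secrets.compare_digest(submitted_token.encode("utf-8"),
                                  record["csrf"].encode("utf-8"))
```

Because the token lives in the session record, deleting the session on logout or expiry removes the CSRF token as well.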

My understanding is that, as long as the browser enforces the Same-Origin Policy (https://en.wikipedia.org/wiki/Same-origin_policy), a CSRF attacker could submit a GET request to my server but could not read the response. I see here (https://en.wikipedia.org/wiki/Same-origin_policy#History) that this feature is supported as of Netscape Navigator 2, which is enough for me :). (Incidentally, the HttpOnly attribute is supported as of IE6 SP1 and Safari 4: https://stackoverflow.com/questions/528405/which-browsers-do-support-httponly-cookies.)

I'm also aware (thanks to @procombo, whom I hopefully understood correctly) that by setting the SameSite cookie attribute to "strict", it's possible to avoid creating CSRF tokens, but this only works on new-ish browsers (https://caniuse.com/#feat=same-site-cookie-attribute). Since I'd like to support old versions of IE, I don't mind the extra complexity of CSRF tokens as long as the approach I outlined above is tenable from a security perspective.

If you see any security issues in the above scheme - or if you use a similar scheme and know it to be secure - please let me know. Thank you very much for your feedback!

namibj commented 4 years ago

There is actually a far simpler solution to this CSRF-prevention-problem: use https and the Referer Header. This won't be faked by the UA, and it will be efficient. CSRF tokens are from the age of http proxies stripping the Referer Header, but Browsers shouldn't strip it themselves in any https-only setup where the Referrer-Policy: same-origin Header is still sent. At worst they should ideally leave the Origin: Header intact, and at the very least comply with CORS Headers. If you have a user that isn't willing to send the Referer Header, and isn't willing to e.g. unlock his account for CORS GET/HEAD w.r.t. billing / rate limits (you might charge for GET requests or do account/session-specific rate limiting, but block CORS requests to these resources to not have to charge for them), just refuse to give in. POST requests get their own Origin Header anyway.

Maybe tell the customer to force Referrer-Policy: same-origin and additionally strip the path if they are that picky.

Edit: I recently wrote a PHP Symfony backend that uses the mentioned Referrer-Policy: same-origin + force-matching between Referer Header and Host Header for CSRF protection. It didn't hit production, but I'm rather focused on pre-emptive, yet efficient security. Complexity can break, so I accept the very, very mild impact on conversion rate from endpoint protection and/or centralized corporate https proxies (both MitM) stripping the Referer Header. The lack of bloat, and the ability to semi-SPA it with a non-private HTTP cache (for search results & co), does have a positive impact on conversions.
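
A rough sketch of such a header check, written in Python rather than PHP and not taken from that backend. The function name, the exact header-name casing in the mapping, and the decision to reject requests that carry neither header are assumptions for illustration:

```python
from typing import Mapping
from urllib.parse import urlsplit


def referer_matches_host(headers: Mapping[str, str]) -> bool:
    """Return True only if Origin (or else Referer) names the same host as Host.

    Assumes `headers` uses the exact keys "Host", "Origin" and "Referer";
    a real framework would normally provide case-insensitive lookup.
    """
    host = headers.get("Host", "").strip().lower()
    if not host:
        return False
    source = headers.get("Origin") or headers.get("Referer")
    if not source:
        # Strict choice: a request without either header is refused.
        return False
    try:
        parts = urlsplit(source)
    except ValueError:
        return False  # malformed URL, e.g. a broken IPv6 literal
    if parts.scheme != "https":
        return False  # https-only setup; also rejects the literal "null" origin
    # netloc includes the port (and any userinfo), so it must match Host exactly.
    return parts.netloc.lower() == host
```

Behind a reverse proxy the Host header seen by the application may differ from the public hostname, in which case the comparison value would have to come from configuration instead.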

Note: I know PHP isn't the best choice, but there was outside influence and my Rust is still too weak (and the nice frameworks are quite unstable) to attempt to start that kind of backend with Rust right now.

fpereiro commented 4 years ago

Hi @namibj. Thanks for your suggestion!

I did a bit more research about using Referer or Origin headers. In particular, I read this excellent SO answer (https://stackoverflow.com/a/9283830) and the "Verifying origin with standard headers" section of OWASP's CSRF Prevention Cheat Sheet (https://owasp.org/www-project-cheat-sheets/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html).

The solution you propose would work most of the time and would be perfectly acceptable. However, there are a few aspects of it that make me still prefer CSRF tokens: 1) having to keep track of the valid referer in the server configuration (so that I have a proper value to compare against), which might get tricky in certain situations; 2) requiring API clients (besides browsers) to reference this field, which they'd have to copy into a global variable somewhere; 3) the fact that user agents can omit sending these headers without violating HTTP standards (detailed info at the bottom of the relevant section of the OWASP cheat sheet).

I recently implemented CSRF tokens using the approach I outlined above - it cost me about 20 extra lines of code. Here's a link to the commit (all server-side code): https://github.com/altocodenl/acpic/commit/8a3e474cbcbccbece12df9940981895953a76d58

I'll add my thoughts on this to the main document soon.

Thanks again for your feedback!

namibj commented 4 years ago

> I did a bit more research about using Referer or Origin headers. In particular, I read this excellent SO answer (https://stackoverflow.com/a/9283830) and the "Verifying origin with standard headers" section of OWASP's CSRF Prevention Cheat Sheet (https://owasp.org/www-project-cheat-sheets/cheatsheets/Cross-Site_Request_Forgery_Prevention_Cheat_Sheet.html).

Nice.

> The solution you propose would work most of the time and would be perfectly acceptable. However, there are a few aspects of it that make me still prefer CSRF tokens:

Fine, in principle at least.

> 1) having to keep track of the valid referer in the server configuration (so that I have a proper value to compare against), which might get tricky in certain situations;

Sure, could get tricky, but tokens aren't easy at all.

> 2) requiring API clients (besides browsers) to reference this field, which they'd have to copy into a global variable somewhere;

No. You just make them send the same value (modulo formatting, if needed) as they already do in the Host header.

> 3) the fact that user agents can omit sending these headers without violating HTTP standards (detailed info at the bottom of the relevant section of the OWASP cheat sheet).

Sure, they can, but this already breaks so much that one likely won't have to support this case.

> I recently implemented CSRF tokens using the approach I outlined above - it cost me about 20 extra lines of code. Here's a link to the commit (all server-side code): altocodenl/acpic@8a3e474

And it will cost you larger payloads in both directions.

> I'll add my thoughts on this to the main document soon.

Nice.

> Thanks again for your feedback!

Thank you!

fpereiro commented 4 years ago

Finally got around to writing my conclusions here: https://github.com/fpereiro/backendlore/commit/f9deda3402c24ac8ddbb98c4f3fab679d8bfbc39. I have referenced this issue for those who might want to avoid using CSRF tokens. Thanks @Alcaro & @namibj for the discussion!
