mozilla / fxa-oauth-server

OAuth server for Firefox Accounts

Use bearer token as signing cred #7

Closed seanmonstar closed 8 years ago

seanmonstar commented 10 years ago

Current: Imagine foo.com gets a token for Alice, with permissions read profile, write bankaccount. When making a request to the profile server, passing this token to it also happens to give the profile server permission to write to bankaccount of Alice.

Instead of passing the token to attached services, it'd be great if the token were used to sign the request, with the resulting HMAC appended. The OAuth server could verify the HMAC without the attached service ever getting its robotic hands on the sacred token.
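A minimal sketch of that idea, assuming the request is reduced to method, URL, and a body hash joined by newlines. That layout and the names `sign_request` and `verify_request` are illustrative assumptions, not anything fxa-oauth-server defines. The client keeps the token and sends only the MAC; the OAuth server, which knows the token, recomputes it.

```python
import hashlib
import hmac


def sign_request(token: bytes, method: str, url: str, body: bytes = b"") -> str:
    """Return a hex HMAC-SHA256 over the request, keyed by the bearer token."""
    # Hypothetical canonical form: METHOD, URL, and SHA-256 of the body.
    message = b"\n".join([
        method.upper().encode(),
        url.encode(),
        hashlib.sha256(body).hexdigest().encode(),
    ])
    return hmac.new(token, message, hashlib.sha256).hexdigest()


def verify_request(token: bytes, method: str, url: str, body: bytes, mac_hex: str) -> bool:
    """Run by the party that knows the token (here, the OAuth server)."""
    expected = sign_request(token, method, url, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected, mac_hex)
```

A MAC computed for the profile request does not verify for a different method or URL, so a service that sees it cannot reuse it against the bank account endpoint.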

seanmonstar commented 10 years ago

cc @warner

warner commented 10 years ago

Yeah, that'd be cool. It probably takes us outside the OAuth specifications, though, and differs from the other OAuth-consuming services that we've looked at (google, github). It also means that the fxa-oauth-server must store the actual token, instead of a hash thereof, which means that an attacker who grabs the oauth-server's database wins. So in some way it's a tradeoff of security against wire-level eavesdroppers and third-party-services, versus database attacks.
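The storage tradeoff can be sketched roughly as follows. The variable and function names are hypothetical, and the "database" is just module-level state. A bearer token can be checked against a stored hash, while an HMAC can only be recomputed by a server that holds the raw key.

```python
import hashlib
import hmac
import secrets

token = secrets.token_bytes(32)

# Bearer style: the database only needs a hash of the token.
stored_hash = hashlib.sha256(token).digest()


def check_bearer(presented: bytes) -> bool:
    """Hash whatever the client presented and compare with the stored hash."""
    return hmac.compare_digest(hashlib.sha256(presented).digest(), stored_hash)


# HMAC-signing style: the server must recompute the MAC, so the database
# has to hold the raw token itself. A database leak then leaks the key.
stored_key = token


def check_mac(message: bytes, mac: bytes) -> bool:
    """Recompute the MAC with the stored raw key and compare."""
    expected = hmac.new(stored_key, message, hashlib.sha256).digest()
    return hmac.compare_digest(expected, mac)
```

An attacker who steals `stored_hash` cannot present it as the token, but one who steals `stored_key` can sign anything.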

seanmonstar commented 10 years ago

I'd prefer a properly-secure solution over implementing a less-secure OAuth2 spec.

That sucks about storing the keys though. Is there no way to avoid that, short of generating some RSA keys and handing over the private key?

warner commented 10 years ago

Yeah, there are basically three options:

seanmonstar commented 10 years ago

Let's think crazy for a little bit here.

  1. What if the /v1/token endpoint accepted a public key parameter, and the client submitted a public key? Then they sign their requests with their key, and we can verify on our side. Too many bits flying around? Too slow? Too insane?
  2. Or: what if the client asked for a sort of assertion from us? Like, GET /v1/assertion?url=blah, and we send back a 1-ish use code that they can send to the attached service. We'll verify it, but it can't be used again, and it lives short enough that a db compromise won't give you anything useful.

warner commented 10 years ago

I love crazy! :) I like the first one: the client (aka RP, aka "the app") can trade in their bearer token for the ability to sign requests instead.

The message sizes and speed are manageable. My favorite public-key signature algorithm (Ed25519) takes only a few hundred microseconds to sign and verify messages (as long as you're using a C implementation), and the keys/signatures are tiny (32 byte private keys, 32 byte pubkeys, 64 byte signatures).

The main problem is convincing people to use it. New == scary :). OAuth 1.0 had a lot more crypto, still mostly symmetric but enough that nobody got it right, and OAuth 2.0 threw a lot of it away.

But maybe we could start small, with internal apps, and provide them with some libraries.

seanmonstar commented 10 years ago

Wonderful! I was afraid we'd need 1024-byte DSA keys, and the slowness that comes with. We can get away with 32-byte keys? Yes please!

How would they trade their bearer token? A third endpoint? /v1/key?token=previousToken&key=myPublicKey?

I'm not sure why we'd even need tokens, then. I'd think if we used keys, we could just change /v1/token to require the client to POST code and pubkey, and we'd return back 200. Then, we'd need to specify how they sign requests, and our /v1/verify endpoint could check the signature using the public key.

seanmonstar commented 10 years ago

Or, if it seems annoying for clients to figure out how to generate these keys, we could generate the pair from the /v1/token endpoint, and send them back the private key instead of a token. But then they'd have to trust that we threw it away, and that no one managed to sniff it on the wire (insert Han Solo's "she'll hold together" quote regarding SSL).

warner commented 10 years ago

Yeah, it'd probably be best to skip the token entirely: submit code+pubkey, oauth-server remembers (pubkey->uid+scopes). The final API requests should sign everything that could affect the operation, and include the pubkey along with the request (since it's nearly as short as any other keyid you might include). Then the API handler could do one of:

If we go with that last one, we could have the service cache the response for a while, or (for faster/more-accurate revocation) subscribe to hear about the pubkey being revoked. It might be ok to submit just the pubkey and get back a response with all the scopes that key is allowed, and do the scope checks locally, although we should think through the privacy implications.
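A rough sketch of the caching variant, assuming the oauth-server exposes a pubkey lookup that returns the uid and scopes. The class names, the `lookup` method, and the TTL are all hypothetical. It also shows the cost of caching: a revoked key keeps working until the cached entry expires.

```python
import time
from typing import Callable, Dict, FrozenSet, Iterable, Optional, Tuple

Grant = Tuple[str, FrozenSet[str]]  # (uid, scopes)


class OAuthServer:
    """Remembers pubkey -> (uid, scopes) after a code is redeemed."""

    def __init__(self) -> None:
        self._grants: Dict[str, Grant] = {}

    def register(self, pubkey_hex: str, uid: str, scopes: Iterable[str]) -> None:
        self._grants[pubkey_hex] = (uid, frozenset(scopes))

    def revoke(self, pubkey_hex: str) -> None:
        self._grants.pop(pubkey_hex, None)

    def lookup(self, pubkey_hex: str) -> Optional[Grant]:
        return self._grants.get(pubkey_hex)


class CachingService:
    """An attached service that caches lookups and checks scopes locally."""

    def __init__(self, server: OAuthServer, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._server = server
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Optional[Grant]]] = {}

    def is_allowed(self, pubkey_hex: str, scope: str) -> bool:
        now = self._clock()
        entry = self._cache.get(pubkey_hex)
        if entry is None or now - entry[0] >= self._ttl:
            # Cache miss or stale entry: ask the oauth-server again.
            entry = (now, self._server.lookup(pubkey_hex))
            self._cache[pubkey_hex] = entry
        grant = entry[1]
        return grant is not None and scope in grant[1]
```

A revocation subscription would replace the TTL check with an explicit cache eviction when the oauth-server announces that a pubkey is no longer valid.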

I don't know how to do it yet, but it'd be swell if we could build something here that achieved Persona's privacy goals. The initial OAuth redirect flow is probably more of an obstacle than the final request verification process, though.

Using pubkeys would reduce the "window of vulnerability" (of compromised creds) to the initial OAuth redirect flow. Once the code was redeemed, then nothing stored on the server or flowing over the wire would have any power (e.g. imagine if somebody got hold of a day-old backup tape of the oauth-server's database, and how we'd make that a non-event: all the codes have expired, all that's left are pubkeys). Maybe we can invent even crazier approaches that reduce the window further: maybe the client signs a Persona assertion with the (pubkey, scopes) embedded in the signed payload, submit that to fxa-oauth-server as proof of delegation. Then the fxa-auth-server's "here is your session token" response might be the only remaining secret that crosses the wire.

The "channel-bound cookies" work that Google is doing is relevant here (mostly maintaining the usual OAuth client behavior, but tokens that arrive over the wrong TLS session aren't accepted). Going full-pubkey is cleaner in some ways, but a bigger change. But, hey, I said I love crazy :).

I'm working on easy-to-include Ed25519 signature modules (for Node and python). Current speed (on my laptop) is 1ms to sign and 3.5ms to verify. There are other bindings to the optimized code (which might require manually installing an extra library first) that should be 10x faster.

seanmonstar commented 10 years ago

So, with the faster and shorter keys, the downside is... weaker, I assume? Of course, 32 bytes can be brute-forced faster than 1024. But, is "faster" within the span of a human life?

Assuming all these things are positive, I'd love to go forward with this. It's no longer OAuth, but no need to be shackled to a spec if it's less secure.

warner commented 10 years ago

No, actually, it's way stronger. These are elliptic-curve signatures, so a 32-byte key gets you a security level of 128 bits (i.e. the computational complexity necessary to forge a signature is about 2^128). This is the same security level as AES-128. To get the same security out of RSA requires at least a 3000-bit key (compare the "Symmetric", "Logarithm Group", and "Elliptic Curve" columns of http://www.keylength.com/en/3/).

Modern elliptic curves are awesome. The only downside is that you need somebody really smart to invent them, but that only has to be done once :-). http://ed25519.cr.yp.to/ has more details and a boatload of papers.

seanmonstar commented 10 years ago

How difficult would it be for clients to create their keypair? In whichever language they were using? Am I right in thinking there aren't many bindings for most of the popular languages (JS, Java, C, Python, Ruby, PHP, C#)?

warner commented 10 years ago

I was just looking for that: https://github.com/jedisct1/libsodium/#bindings-for-other-languages has a list. I've used the python ones, not sure about the state of other languages. We'd probably need to write a library for clients, at about the same level as HAWK, to make sure everybody serialized the requests the same way.

seanmonstar commented 10 years ago

Such as providing fxaSign(request, key) in all those languages? Yikes... What about providing an exhaustive input-to-output test, so anyone can easily add the test to their code and be sure they're doing it correctly?
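One way such a test could look, as a sketch only: a reference implementation emits JSON test vectors, and any other implementation is run against them. The canonical form used by `reference_normalize` is a made-up placeholder, since the thread has not settled on a serialization.

```python
import hashlib
import json
from typing import Callable, Dict, List


def reference_normalize(method: str, path: str, body: str) -> str:
    """Hypothetical canonical form: METHOD, path, and SHA-256 of the body."""
    body_hash = hashlib.sha256(body.encode()).hexdigest()
    return "\n".join([method.upper(), path, body_hash])


def make_vectors(cases: List[Dict[str, str]]) -> str:
    """Serialize input/expected pairs so they can be shipped to other languages."""
    return json.dumps(
        [{"input": case, "expected": reference_normalize(**case)} for case in cases]
    )


def run_vectors(vectors_json: str, impl: Callable[..., str]) -> List[Dict[str, str]]:
    """Return the inputs for which impl disagrees with the reference output."""
    failures = []
    for vector in json.loads(vectors_json):
        if impl(**vector["input"]) != vector["expected"]:
            failures.append(vector["input"])
    return failures
```

A port in PHP or Ruby would load the same JSON file and fail its own test suite on any non-empty result.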

What do you think? Should we proceed with this? Or is it too obscure?

warner commented 10 years ago

Let's at least prototype it. One reason OAuth2 replaced OAuth1 was because v1 had too much crypto for most devs to get right, even with libraries, and there's a danger of falling into the same trap here. But I think we could build both (accept bearer token in an Authorization: header, or accept signed request) and then tell devs to prefer the signature one.

So I think the steps would be:

That last question needs to be answered for bearer tokens too.

warner commented 10 years ago

Oh, also, let's hack on this at the workweek next week.

seanmonstar commented 10 years ago

Ok, so to move forward on this: I took a deeper look at Hawk. It seems to me that we can pretty much use the entirety of Hawk, where the id is perhaps a token id, and the key is the private key generated by the client. We could look up the public key via the token, and verify the HMAC.

Does that seem sane? Are there bits of Hawk that get in the way, or make it harder for people to use, that we don't need? Or do we use it, and just provide far better documentation?

seanmonstar commented 10 years ago

Ok, so I took a look deeper at Hawk, and I can't use it exactly, because it only assumes sha1 or sha256, and I can't provide my own hashing algorithm. So I'm building a node library that uses ed25519 and some easy functions for keys and signing.

@warner I noticed that since this signature is really encrypting the entire contents, the size of the signature grows with the content. So, longer query strings, and especially to provide payload hashing, the hash part of the Auth header will start to get quite large. I was thinking that perhaps, instead the hash could be a sha256, and then that hash could be signed, so it would stay small.

id="the_token_we_gave_you"
ts="Date.now()"
nonce="randomBytes(4)"
hash="ed25519.sign(sha256(blob))"

ckarlof commented 10 years ago

@seanmonstar you're correct. Typically when you generate a digital signature, you sign a fixed length hash of the content you want to sign rather than the content itself. This is often called hash-then-sign. In fact, if you don't do this with RSA, it's insecure (and you can only sign short messages): http://crypto.stackexchange.com/questions/12768/why-hash-the-message-before-signing-it-digital-signature-with-rsa

ed25519.sign(sha256(blob)) is probably fine, but since I'm not familiar with ed25519, I defer to @warner.
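A small sketch of hash-then-sign, with the signer passed in as a callable. `recording_signer` is a hypothetical stand-in that only records the size of what it is asked to sign; it performs no cryptography.

```python
import hashlib
from typing import Callable, List


def hash_then_sign(blob: bytes, sign: Callable[[bytes], bytes]) -> bytes:
    """Reduce the content to a fixed-length digest, then sign only the digest."""
    digest = hashlib.sha256(blob).digest()  # always 32 bytes
    return sign(digest)


seen: List[int] = []


def recording_signer(digest: bytes) -> bytes:
    # Stand-in only: records the input length and echoes the digest back.
    seen.append(len(digest))
    return digest


hash_then_sign(b"a=1", recording_signer)
hash_then_sign(b"x" * 1_000_000, recording_signer)
print(seen)  # [32, 32]
```

Whatever the size of the query string or payload, the signer sees the same 32 bytes of input.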

warner commented 10 years ago

The Ed25519 sign() function returns "signedmessage", which is the signature (64 bytes) concatenated with the message. The corresponding verify() function takes signedmessage and returns "message" iff the signature was correct.

Personally, I prefer to defer even looking at the message until the signature has been verified, and a "signedmessage"-style API enforces this quite naturally. HAWK, on the other hand, is more in the "sign the canonical representation" style of APIs, where most of the stuff it's signing is delivered in other places: the HTTP command line, other headers, etc.

To minimize the changes we're making to HAWK, we could just split off the 64-byte signature from the return value of sign() (ed25519.sign(xx)[:64]), and use it in place of the "mac" field.

So if HAWK is roughly:

$hash = SHA256("hawk.1.payload"+$body)
$normalized = "hawk.1.header" + timestamp/nonce/URIstuff + $hash
$mac = HMAC-SHA256(key, $normalized)
Authorization: Hawk id= ts= nonce= hash= mac=$mac

Then we could do:

$hash = SHA256("NAME.1.payload"+$body)
$normalized = "NAME.1.header" + timestamp/nonce/URIstuff + $hash
$sig = ed25519.sign($normalized)[:64]
Authorization: NAME pubkey= ts= nonce= hash= sig=$sig

And the receiver would need to concatenate $sig with $normalized before feeding it into ed25519.verify. I'd suggest delivering pubkey= as hex, since it's pretty short.
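The construction above could be sketched like this. Only the hashing and string assembly are meant literally. The newline-joined layout is a simplification rather than real HAWK normalization. `sign` and `verify` are pluggable callables, and `demo_sign`/`demo_verify` are HMAC stand-ins so the sketch runs with the standard library; they are not Ed25519 and not a public-key scheme.

```python
import hashlib
import hmac
from typing import Callable, Dict

SCHEME = "NAME"  # placeholder scheme name, as in the thread


def payload_hash(body: bytes) -> str:
    return hashlib.sha256(f"{SCHEME}.1.payload".encode() + body).hexdigest()


def normalize(ts: str, nonce: str, method: str, uri: str, body_hash: str) -> bytes:
    parts = [f"{SCHEME}.1.header", ts, nonce, method.upper(), uri, body_hash]
    return "\n".join(parts).encode()


def build_fields(pubkey_hex: str, ts: int, nonce: str, method: str, uri: str,
                 body: bytes, sign: Callable[[bytes], bytes]) -> Dict[str, str]:
    """Client side: hash the payload, sign the normalized string."""
    body_hash = payload_hash(body)
    sig = sign(normalize(str(ts), nonce, method, uri, body_hash))
    return {"pubkey": pubkey_hex, "ts": str(ts), "nonce": nonce,
            "hash": body_hash, "sig": sig.hex()}


def format_header(fields: Dict[str, str]) -> str:
    attrs = " ".join(f'{name}="{value}"' for name, value in fields.items())
    return f"{SCHEME} {attrs}"


def verify_fields(fields: Dict[str, str], method: str, uri: str, body: bytes,
                  verify: Callable[[bytes, bytes, bytes], bool]) -> bool:
    """Receiver side: rebuild the normalized string from the request it saw."""
    if not hmac.compare_digest(fields["hash"], payload_hash(body)):
        return False
    normalized = normalize(fields["ts"], fields["nonce"], method, uri, fields["hash"])
    return verify(bytes.fromhex(fields["pubkey"]), bytes.fromhex(fields["sig"]), normalized)


# Stand-ins for demonstration only. HMAC is NOT a public-key signature.
_DEMO_KEY = b"demo-key"


def demo_sign(message: bytes) -> bytes:
    return hmac.new(_DEMO_KEY, message, hashlib.sha256).digest()


def demo_verify(pubkey: bytes, sig: bytes, message: bytes) -> bool:
    return hmac.compare_digest(sig, demo_sign(message))
```

With a real Ed25519 binding, `sign` would return the first 64 bytes of the signed message, and `verify` would concatenate the signature with the normalized string before calling the library's verify function.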

The verify-then-parse approach would basically use "Authorization: NAME pubkey= sigmsg=", and the receiver would get $normalized out of verify(), and then they'd need to parse it into timestamp/nonce/URIstuff and compare values against the rest of the HTTP request to make sure they matched.
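A sketch of verify-then-parse, assuming the signed message is a JSON object with `method` and `uri` fields (an assumed layout). `sign_message` and `open_message` imitate the shape of a "signedmessage" API using an HMAC tag so the example runs on the standard library; they are stand-ins, not Ed25519, and the tag is 32 bytes rather than a 64-byte signature.

```python
import hashlib
import hmac
import json
from typing import Optional

_DEMO_KEY = b"stand-in key, NOT Ed25519"
_TAG_LEN = 32  # HMAC-SHA256 tag length in this stand-in


def sign_message(message: bytes) -> bytes:
    """Stand-in for sign(): returns tag || message."""
    return hmac.new(_DEMO_KEY, message, hashlib.sha256).digest() + message


def open_message(signed: bytes) -> Optional[bytes]:
    """Stand-in for verify(): return the message only if the tag checks out."""
    tag, message = signed[:_TAG_LEN], signed[_TAG_LEN:]
    expected = hmac.new(_DEMO_KEY, message, hashlib.sha256).digest()
    return message if hmac.compare_digest(tag, expected) else None


def accept_request(signed: bytes, method: str, uri: str) -> bool:
    """Verify first, parse second, then compare against the HTTP request."""
    message = open_message(signed)
    if message is None:
        return False  # unauthenticated bytes are never parsed
    claims = json.loads(message)
    return claims.get("method") == method and claims.get("uri") == uri
```

The point of this shape is that `json.loads` only ever runs on bytes that have already passed verification.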

Of course, the Actually Right way to do it would be to just put everything that could affect the API call (method name + args) into a JSON blob, sign that, then ship length+pubkey+sigmsg over a raw TLS connection: that would ensure that no aspect of the receiver's behavior could be affected by something outside the signed message. But it would look pretty weird to our HTTP-centric eyes, and wouldn't play nicely with ELBs and a lot of the other web-based infrastructure we've evolved into :).

seanmonstar commented 10 years ago

Current progress is happening in https://github.com/seanmonstar/gryphon

Once that is fully working, I'll get it plugged into 123done and fxa-profile. Will file a new issue to change /token to /pubkey, and the internals to handle it.

ckarlof commented 10 years ago

I think it would be overly aggressive to launch this as /pubkey only, given our commitment to Marketplace to have this ready in early May. I anticipate we will operate these in parallel and consider the pubkey/signing approach experimental until we've had the opportunity to debug it and write the support libraries and documentation. It also deserves at least a couple more rounds of RFCs.

renoirb commented 10 years ago

I still think that it would be better to support original OAuth 2... Because "standards". But your argument has its virtues.

If webplatform.org is to use FxA, we will need ed25519 methods working in Python and PHP. But you already talked about language bindings anyway, so it's all good. In our case, we are running some Python projects, plus MediaWiki and other applications that would need to generate the same hashing results.

Fortunately, I found @warner's Python library, a C library, and a C module for PHP. I compiled the PHP C module and wrote a PHP script to check that the module adds the methods, and it does. Now, I'm unsure which operations you'd want clients to perform, though.

How about giving a more detailed example of a real-world usage situation (less "foo", please) showing the operations you expect clients to perform and the expected result? That way, I could validate that the PHP code, and code in other languages for that matter, gives the same result.

Not to mention that this might show future consumers who complain about a non-standard implementation how to ensure that their hashing method is valid. New is frightful, but some help such as this might help them bite the bullet :)

shane-tomlinson commented 8 years ago

Closing until we come back to this. The history is here.