ulrikstrid / ocaml-jose

https://ulrikstrid.github.io/ocaml-jose/
MIT License
53 stars 4 forks source link

thumbprint returns string? #43

Closed Lupus closed 2 years ago

Lupus commented 2 years ago

I find it quite confusing that thumbprint returns a string, is that hex-encoded, base64-encoded, or that string is binary? Looks like the latter glancing at the source code. Shouldn't it be bytes instead? That clearly tells the user that it's hash digest itself, not encoded.

ulrikstrid commented 2 years ago

Since we're doing Cstruct.to_string for every thumbprint, would you want us to just return the Cstruct.t instead?

Lupus commented 2 years ago

There's also Cstruct.to_bytes 🤔 hex package accepts bytes as input, while base64 works with string only. Maybe Thumbprint.t, which will be Cstruct.t under the hood with Thumbprint.to_hex_string, Thumbprint.to_base64_string, Thumbprint.to_bytes, Thumbprint.to_cstruct? Or that's over-engineering? 😄

ulrikstrid commented 2 years ago

So I have not used the thumbprint, but if you have a clear usecase for it I would want to support it. But I agree that it's currently not clear what it is. I didn't see a clear explanation of what it is supposed to be serialised as when skimming the RFC again, might have to read it closer to find out.

Lupus commented 2 years ago

I want to use thumbprint as primary id for public key in the database, binary representation does not work for that, so I want to base64 encode it. The same I'm doing in Go client, so that the same id is known to the client out of the key.

RFC does not specify the serialization of the hash and that makes sense. It specifies how exactly stuff gets hashed (and what stuff out of JWK which has a lot of different defined properties), and as long as parties agree on algorithm and digest encoding - they have the same values.

Probably at the bare minimum it will help the users of Jose.Jwk.get_thumbprint if there is a documentation attached, that says that the resulting string is binary digest produced by the hash function of choice. Glancing at onlye string output type, I was expecting some hex encoded string, tried printing it, then went looking at the source and finally applied base64 encoding.

ulrikstrid commented 2 years ago

The kid (key id) member is already supposed to be (in this implementation) the thumbprint base64 encoded I think so you could just use that instead? The bad thing about using the kid is that there's no standard for creating them, but the good thing is that they are a standard part of the JWK. What do you think about that?

I think thumbprint is supposed to follow the x5t spec in this case, so in either case I need to change it.

Lupus commented 2 years ago

kid can be pretty much arbitrary, no? I don't want to trust client to provide the right kid, I want to infer it from the public key itself. Client is in Golang, no idea what convention they settled on if it's not specified in RFC.

I think thumbprint is supposed to follow the x5t spec in this case, so in either case I need to change it.

You mean how it's encoded for textual representation? It's calculated correctly right now, Golang implementation gets the same digest for the same key, and also exposes it as byte array (i.e. encode it yourself the way you like).

ulrikstrid commented 2 years ago

I think the right choice is to change it so that we get the raw bytes and then we can use it internally for kid and x5t.

PR welcome or I'll do it a bit later.

Thanks for a good discussion about a real world usecase

Lupus commented 2 years ago

After #47 thumbprints are returning base64 encoded strings as it seems? Probably returning Cstruct.t would be the least confusing for the users indeed :)