jpadilla / pyjwt

JSON Web Token implementation in Python
https://pyjwt.readthedocs.io
MIT License
5.08k stars 679 forks source link

Expose get_algorithm_by_name as new method #773

Closed sirosen closed 2 years ago

sirosen commented 2 years ago

Looking up an algorithm by name is used internally for signature generation. This encapsulates that functionality in a dedicated method and adds it to the public API. No new tests are needed to exercise the functionality.

Rationale:

  1. Inside of PyJWS, this improves the code. The KeyError handler is better scoped and the signing code reads more directly.

  2. This is part of the path to supporting OIDC at_hash validation as a use-case (see: #295, #296, #314).

This is arguably sufficient to consider that use-case supported and close it. I believe there is more to do before closing that -- I would like to see Algorithm objects support a method for computing hash digests. However, it is an improvement and step in the right direction in either case.

A minor change was needed to satisfy mypy, as a union-typed variable does not narrow its type based on assignments. The easiest resolution is to use a new name, in this case, simply algorithm -> algorithm_.


Most of the change is in the first commit. In the second commit, I applied get_algorithm_by_name to _verify_signature as well. All tests pass for me locally.

After this change, it becomes possible to do the following for handling the at_hash claim from OIDC (as a client, not a server):

# get data
data = jwt.decode_complete(...)
access_token = ...  # provided separately
payload, header = data["payload"], data["header"]

# get algo object
alg_name = header["alg"]
alg_obj = jwt.get_algorithm_by_name(alg_name)  # <-- new!

# compute at_hash
# works for the hashlib cases, needs more for cryptography cases
digest = alg_obj.hash_alg(access_token).digest()
at_hash = base64.urlsafe_b64encode(digest[:(len(digest) // 2)]).rstrip("=")

# validate / assert
assert at_hash == payload["at_hash"]

I'm open to discussion about the algorithm_ var for mypy. There are several solutions. I prefer this because it preserves the value of type-checking (vs a cast or type-ignore, which effectively disables checking), but it also increases the number of names in that scope, which can lead to confusion.

jrobbins-LiveData commented 11 months ago

Is there some documentation available on jwt.get_algorithm_by_name() and its alg_obj.compute_hash_digest()?

sirosen commented 11 months ago

I did a doc PR back when this merged which added this: https://pyjwt.readthedocs.io/en/stable/usage.html#oidc-login-flow

Probably a nice starting point. There should also be reference doc. EDIT: Search that last bit. I just looked at the site and see that there are gaps in the reference docs. But these methods have docstrings, so whenever that gets fixed they should be picked up.

jrobbins-LiveData commented 11 months ago

That oidc-login-flow is excellent -- thank you!

Not really on this topic (sorry), but in making a pytest unit test, I seem to have hit a snag wherein moto doesn't seem to mock urllib.request.urlopen. It does mock requests. So when PyJWKClient.fetch_data calls urllib.request.urlopen for jwks.json in a test, the call fails. This of course sounds like a moto problem, but, any chance that pyjwt could use requests instead of urllib?