duo-labs / py_webauthn

Pythonic WebAuthn 🐍
https://duo-labs.github.io/py_webauthn
BSD 3-Clause "New" or "Revised" License
856 stars 171 forks source link

Fix handling of optional fields such as user_handle. #180

Closed jwag956 closed 9 months ago

jwag956 commented 1 year ago

UserHandle is tagged as Optional[bytes] so field.annotation == bytes doesn't match - add explicit match for user_handle.

The conversion from base64 should only be done if input was json - there was a comment to this effect that that didn't work when using parse_raw - but recent releases, using Pydantic 2.0 no longer use those deprecated methods.

closes #174

MasterKale commented 1 year ago

So userHandle is an interesting one in the spec. user_id, which this library types as str in generate_registration_options(), is JSON-serializable as-is so it doesn't undergo any transformation into base64url. If you then decode the string as UTF-8 to bytes on the front end during registration, just before you pass it in as options.user.id when calling .create(), then during authentication the value of userHandle will be bytes you can encode back to UTF-8 to get the exact same value you specified as user_id during registration.

I found two years ago, from a developer ergonomics story, that it's much better to not base64url-encode/decode user.id and userHandle and use UTF-8 instead. There's some more context here, showing my work as it were, over in SimpleWebAuthn:

https://github.com/MasterKale/SimpleWebAuthn/pull/120

The problem over here in py_webauthn land is that I designed this API to pair well with @simplewebauthn/browser methods that use UTF-8 encoding/decoding on the front end specifically for user.id and userHandle. I think perhaps what's missing from this library is some kind of documentation about this behavior, but I don't have as comprehensive a documentation site (and am not sure when I'd ever have the time to spin one up) to convey insights like this.

I think that leaves docstrings as an option here as clearly the use of UTF-8 is non-obvious to anyone but me 🫠

jwag956 commented 1 year ago

I have looked at the spec over and over - and still get confused. One thing that really confused me was this section: https://www.w3.org/TR/webauthn-2/#sctn-automation-add-credential which talks about user-agent automation - and explicitly says to b64encoded - but that's an extension.

I don't really care either way - as long as it's documented. With pydantic V1 - it is explicitly called out as b64encoded in _object_hook_base64url_to_bytes when decoding JSON (along with a bunch of other fields).

With pydantic V2 - since it is looking at the python typing information, it is looking for ALL fields that are bytes, and any 'string' is b64decoded - the reason user_handle isn't being decoded is JUST because it is typed as Optional[bytes] and trying to test for 'Optional' is difficult (I googled it).

In my first PR - I was overly aggressive - but felt that using python types as a proxy for how the fields are encoded seems fragile - as opposed to how you did in with pydantic V1 - where you explicitly coded which fields should be b64encoded.

In any case - what I think is missing is explicit documentation for the wire protocol that py_webauthn accepts and emits - which would allow front ends and RPs to be written.

If you feel like with the move to pydanticV2 and 1.11 release that user_handle should be raw bytes - I can easily change my front end and RP.

Finally - and the reason I initially filed the issue is that as the code currently is - there is a difference of how user_handle is managed depending on whether the application uses pydanticV1 or V2 - so effectively the wire protocol that py_webauthn exposes is different (and not backwards compatible) - that should be documented - and if you think it would be better to have user_handle be UTF-8 - this would be a great time to break compatibility!

jwag956 commented 1 year ago

Ok - I hopefully fixed mypy - I really believe that supporting standard JSON from front-end to RP is important - and therefore each of the fields passed needs to be json-serializable - and b64urlencode is common. So webauthn really should support that for user_handle.

jwag956 commented 11 months ago

Hello! any chance we could either get this in or figure out how to solve this regression when moving to pydantic v2?

Thanks!

MasterKale commented 11 months ago

You know what, I just realized that I've been base64url-encoding user.id the entire time, which means userHandle is coming back as the base64url-encoded version, not the string passed in as user_id into generate_registration_options(). I thought I'd fixed that... 🤦‍♂️

Let me look again at what you're proposing in this PR, I thought it was changing this behavior but you're only trying to ensure that user_handle gets consistently handled. Sorry for the delay, I'll make time to take a look tomorrow ✌️

MasterKale commented 9 months ago

Thank you for taking the time to submit and talk about this PR @jwag956. I've merged #195 which removed Pydantic as a dependency of this project so I'm closing out any PR's and issues related to Pydantic as they're no longer needed.