duo-labs / py_webauthn

Pythonic WebAuthn 🐍
https://duo-labs.github.io/py_webauthn
BSD 3-Clause "New" or "Revised" License
856 stars 171 forks source link

use of attrs breaks compatibility with fastAPI #113

Closed samuelcolvin closed 2 years ago

samuelcolvin commented 2 years ago

Continued from discussion at https://github.com/duo-labs/py_webauthn/pull/111#issuecomment-998796413.

In v1.2.0 this library switched from pydantic to attrs+cattrs.

I can understand why opinionated validation should not be part of a library like this, but switching from pydantic to attrs+cattrs doesn't change that, it just switches from one library to another.

A big down side of this is that it breaks "out of the box" compatibility with FastAPI. FastAPI is the third most popular and fastest growing web framework for python. Pydantic is not just used in FastAPI, there are libraries to support it's use in flask, django and other frameworks. By contracts, while attrs is popular, it's not generally used for external data validation, cattr is not as popular or widely used as either attr or pydantic fwiw.

I propose that we either:

An added advantage of dataclasses is that pydantic has support for validation of dataclasses so py_webauthn would go back to workout out of the box with fastAPI and pydantic. If cattrs is in activate development, it too should support validation of dataclasses, but I can't find anything from a quick search.


Admission: I'm the original developer and maintainer of pydantic

MasterKale commented 2 years ago

Admission: I'm the original developer and maintainer of pydantic

Way to bury the lede 😂

I can understand why opinionated validation should not be part of a library like this, but switching from pydantic to attrs+cattrs doesn't change that, it just switches from one library to another.

Time for a bit of history: use of Pydantic in the original re-release originated internally at Duo, where we're starting to use Pydantic more and more as we push for more comprehensive type declarations on new code. I, being a fan of TypeScript, enjoyed the stronger (runtime) type checking that came with Pydantic, and it also made it easy to recursively rename dict properties from snake_case to camelCase and convert to and from Base64URL to generate JS-like JSON.

When @jwag956 filed Issue #110 noting the library's inability to support bytes-like inputs that might be from DB middleware libraries, I thought the runtime type validation was a step too far, and saw his suggestion of attrs+cattrs as a way to be a better dependency while also making it easy to generate the JS-like JSON output from complex dicts.

And that's the real reason I went with attrs+cattrs versus vanilla dataclasses (which, I agree with you, would have been fine since I was already dropping the requirement for runtime type checking): I didn't want to have to write a bunch of overly-specific or even "clever" to_json() code to take in a dict and recursively do all the property renaming and value coercions. I wanted to incorporate something tried and tested for that instead. Hence my decision to roll with cattrs.

  • I'll help if there are some issues with pydantic or how it's used

Honestly I'd have preferred to stick with Pydantic. The issue neither I nor @jwag956 could work around is how to define a Pydantic class property to support bytes or "a subclass of bytes", as Type[bytes] wasn't sufficient and we couldn't think of any other way without defining custom validators for every bytes property to check that the value passed in was bytes-like. Do you have any suggestions for how to get https://github.com/duo-labs/py_webauthn/blob/master/tests/test_bytes_subclass_support.py to pass when using Pydantic?

jwag956 commented 2 years ago

One big difference I see is that attrs doesn't require/perform any runtime checking, and it is fully integrated with mypy for projects that want static type checking. cattrs is of course just used to make conversions easier. So the use of those libraries only presents a dependency for py_webauthn - not the applications that use py_webauthn. That makes the library much more usable across all python ecosystems.

However really, if I hadn't run into runtime issues using postgres and mongodb, I wouldn't have cared. I did have a workaround - explicitly casting the binary attributes to 'bytes' when reading from DB.

samuelcolvin commented 2 years ago

So the use of those libraries only presents a dependency for py_webauthn - not the applications that use py_webauthn

That's not really true. Currently attrs and cattrs are dependencies of py_webauthn so they get installed when using py_webauthn, also the objects defined by py_webauthn are attrs classes, not dataclasses.

This is not different to pydantic, unless I'm missing something?

jwag956 commented 2 years ago

So, yes, those are install dependencies - however - from a usage perspective - my application/library doesn't care - py_webauthn exports/documents a set of objects/classes and methods on those classes. How those classes are created doesn't matter to me. As a modern python library, py_webauthn also supports type hints... Yay!

py_webauthn is free to change their implementation, as they did, as long as they don't break compatibility - the change from pydantic to attrs didn't require me to change anything. In this regard I consider python's duck-typing similar to other languages 'interfaces'. In this regard, higher level code/frameworks should honor the interface and not require a particular implementation.

phyyou commented 2 years ago

I'm using py_webauthn with FastAPI, so I think it's good to using pydantic for what it's worth...

I propose that we either:

  • switch back to pydantic - I'll help if there are some issues with pydantic or how it's used
  • or, switch to dataclasses which are a standard library, no-validation alternative to attr, cattr and pydantic <-- this would be my recommendation

I'm all for that.

I think pydantic isn't wrong choice, in #110 , it raise exception on bytes subclass.

hmm.. How about new feature subclass of bytes that field type in pydantic.

MasterKale commented 2 years ago

Happy New Year, everyone. Time to get back to this!

Honestly I'd have preferred to stick with Pydantic. The issue neither I nor @jwag956 could work around is how to define a Pydantic class property to support bytes or "a subclass of bytes", as Type[bytes] wasn't sufficient and we couldn't think of any other way without defining custom validators for every bytes property to check that the value passed in was bytes-like. Do you have any suggestions for how to get https://github.com/duo-labs/py_webauthn/blob/master/tests/test_bytes_subclass_support.py to pass when using Pydantic?

@samuelcolvin Might you have any thoughts on this? This was the thing that caused all of this conversation. Are we just missing something about Pydantic, or is there an opportunity for an enhancement to the lib?

MasterKale commented 2 years ago

Two weeks later and no response puts me in a weird place. Pydantic is out because it doesn't support bytes subclasses for bytes properties, and attrs + cattrs breaks support for FastAPI. Time for another rewrite to use pure dataclasses I guess? That's the only path forward I can see without breaking anything else called out in this discussion 😮‍💨

jwag956 commented 2 years ago

Obviously I am not a neutral party :-)

There is no reason you, as the owner/maintainer of a middleware utility package, should change the design and implementation choices you have made in order for some higher level framework to ‘work’ (unless of course you are positioning this library as part of a framework ecosystem).

In this case - auto-generating an API from pywebauthn doesn’t seem overly important - an application that wants to integrate webauthn will need richer APIs than just the minimum data structures that pywebauthn needs. I can say this with some authority since I am integrating pywebauthn into Flask-Security and there is quite a bit of wrapper code required to actually integrate this with various data stores, primary versus secondary authentication, etc.

FastApi is an opinionated framework, just as Flask and Django are. That’s fine - but the way to deal with them is to have thin glue layers that take generic libraries and customize for the given framework. If you look at Flask for example - many flask_xxx modules are very small layers (look at Flask-SQLAlchemy versus SQLAlchemy).

For the record - I have been using pywebauthn for a couple months now - it is awesome and a great piece of code. Thanks.

samuelcolvin commented 2 years ago

Sorry for slow reply, I'll look today.

samuelcolvin commented 2 years ago

Are we just missing something about Pydantic, or is there an opportunity for an enhancement to the lib?

In short, yes. pydantic has fairly good documentation on custom data types, they cover this case very easily:

from pydantic import BaseModel
from pydantic.validators import bytes_validator

class CleverBytes(bytes):
    """
    Custom type to use as an annotation in pydantic models.

    Accepts either bytes OR a subclass of bytes, or all the other stuff that work with bytes, e.g. strings
    """
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        if isinstance(v, bytes):
            return v
        else:
            return bytes_validator(v)

class MyModel(BaseModel):
    my_bytes: CleverBytes

class BytesSubclass(bytes):
    def __new__(cls, data: bytes):
        self = bytes.__new__(cls, memoryview(data).tobytes())
        return self

m1 = MyModel(my_bytes=b'normal bytes')
print(m1)
assert m1.my_bytes.__class__ == bytes

m2 = MyModel(my_bytes=BytesSubclass(b'a subclass of bytes'))
print(m2)
assert m2.my_bytes.__class__ == BytesSubclass

m3 = MyModel(my_bytes='a string which will be coersed to bytes (see line 20 above)')
print(m3)
assert m3.my_bytes.__class__ == bytes

auto-generating an API from pywebauthn doesn’t seem overly important - an application that wants to integrate webauthn will need richer APIs than just the minimum data structures that pywebauthn needs. I can say this with some authority since I am integrating pywebauthn into Flask-Security and there is quite a bit of wrapper code required to actually integrate this with various data stores, primary versus secondary authentication, etc.

That's not correct. I have a platform in production which uses webauthn and makes direct use of py_webauthn types (I'm using v1.1.0). Make you should think about using a more modern web framework? :wink:

samuelcolvin commented 2 years ago

The plot thickens, I've just been digging through pydantic trying to work out why the standard bytes validator isn't behaving the same as the above.

I think the actual problem is with cython which pydantic is generally compiled with when you install from pypi. I think the type annotation on the bytes_validator method is causing cython to convert the BytesSubclass to bytes. We should be able to fix this in pydantic fairly quickly.

MasterKale commented 2 years ago
class CleverBytes(bytes):
    """
    Custom type to use as an annotation in pydantic models.

    Accepts either bytes OR a subclass of bytes, or all the other stuff that work with bytes, e.g. strings
    """
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        if isinstance(v, bytes):
            return v
        else:
            return bytes_validator(v)

Huh, so I could have avoided all that work in #111 with this single class definition and switching bytes to CleverBytes in my Pydantic model definitions? And everything would have been fine?

(BTW nothing comes up when I search for things like "bytes_validator", "str_validator", etc... on https://pydantic-docs.helpmanual.io/, btw - I dunno how I'd have discovered this functionality without your example above 😱 )

samuelcolvin commented 2 years ago

And everything would have been fine?

I guess so.

dunno how I'd have discovered this functionality without your example above

ye, I get that.

I think it's really a bug in pydantic that subclasses are not accepted, I'm fixing it at https://github.com/samuelcolvin/pydantic/pull/3707

I understand what you mean about the documentation, but (while the documentation could always be better) I don't think we could ever provide documentation that would have given a result for your query. You're more likely to have luck googling it and finding someone else who's had the same problem. Or creating an issue/discussion and hoping someone answers - I really don't have the time right now to reply to all questions.

MasterKale commented 2 years ago

I think it's really a bug in pydantic that subclasses are not accepted, I'm fixing it at samuelcolvin/pydantic#3707

Thank you for looking into this, I'll follow along and keep my fingers crossed for you.

I understand what you mean about the documentation, but (while the documentation could always be better) I don't think we could ever provide documentation that would have given a result for your query.

Are these preset validators in pydantic.validators touched on anywhere and I'm just missing it? Looking into this file there's a lot of good stuff in here, particularly strict_ versions of validators like strict_bytes_validator, but in a "unknown unknowns" kind of way I'd never have known to ask for such things after reading the docs when I initially incorporated Pydantic 🤔

I take your point about not having time to answer everyone's questions, though. In the future I'll be more keen on posting an issue to the repo and seeing what suggestions come up.

MasterKale commented 2 years ago

Alright, I'd previously proposed moving "down" to dataclasses. As of this morning, though, with @samuelcolvin's CleverBytes example fresh in my mind I'm now seeing a path to going back to Pydantic while also supporting @jwag956's use case. This would essentially involve reverting #111 and then replacing basic bytes types with something like CleverBytes. I think in the end everyone can be happy:

Sound like a plan?

samuelcolvin commented 2 years ago

That sounds wonderful. Thank you.

And sorry again for the extra work caused.

MasterKale commented 2 years ago

It took me a while but I finally found time to revert the attrs+cattrs work as PR #117. I'll sleep on this over the weekend and probably merge and release a beta on Monday.

@jwag956 would you be willing to test things out next week? 😛

MasterKale commented 2 years ago

webauthn==1.3.0 has been successfully published to restore use of Pydantic while also continuing to support bytes-like values when instantiating WebAuthnBaseModel values for the various methods. 😌