Question regarding typing and mypy

sambarnes commented 3 years ago

fhir.resources version: 6.1.0
Python version: 3.7.5
Operating System: Ubuntu 18.04

Description

Hey there! Thanks for all the great work on this package! Looking to start using this pretty heavily internally, so many niceties come out of the box with pydantic-based libraries like this.

One slight hiccup I'm seeing in the dev experience is the mismatch between the type annotation and the actual underlying type at runtime.

For example, the fhir.resources.patient.Patient model has a field type of address: typing.List[fhirtypes.AddressType] where AddressType looked like a wrapper around plain dict. Initially I thought this was a little weird, but maybe makes sense if we wanted to postpone validation until that specific field was used and unpacked into its corresponding pydantic type.

When I went to implement our address functionality, I used logic to the effect of address_pydantic = Address(**patient.address[0]) which threw the following error:

TypeError: ModelMetaclass object argument after ** must be a mapping, not Address

So at runtime, the real type of patient.address is List[fhir.resources.address.Address] rather than the dict based type provided in the annotations.

This unfortunately poses an issue for typecheckers such as mypy. The typechecker expects the value to be an AbstractType during runtime, so it won't allow the usage of pydantic fields (for example patient.address[0].state) unless we put cast( calls all over the place, opening ourselves up to type-related bugs that we were trying to avoid by using a pydantic-based lib.

Question

Would you mind expanding on this design decision to have the AbstractType annotations on each of the pydantic type's fields, instead of the actual pydantic type that is seen at runtime? Does anyone else use typecheckers in combination with this library, without needing to cast everywhere? (Certainly open to using a different type checking system as well if this is just a limitation of mypy)

There very well could be information out there on this topic already, however I was not able to find it in the docs. Let me know if I missed it or if my library usage above does not align with what is expected!

nazrulworld commented 3 years ago

First of all, thank you so much for your nice question and welcome. AddressType is derived from AbstractType (derived from dict). Yes theoretically we can say that AddressType is looked like a plain dict wrapper, but the behavior is completely different. The AddressType is pydantic field type which is evaluating and validating input data and return Address object! In your case if you did address_pydantic = Address(**patient.address[0].dict())

Would you mind expanding on this design decision to have the AbstractType annotations on each of the pydantic type's fields, instead of the actual pydantic type that is seen at runtime?

Can you please make more elaborate your suggestions with example. It seems interesting to me.

sambarnes commented 3 years ago

I really appreciate the prompt response!

After posting the question, I did a little more reading of the internals and found what you described above. I now see that it's the AddressType performing the validation and returning a new Address. I think I have a better idea of how it works now.

In the code that you've posted above address_pydantic = Address(**patient.address[0].dict()), mypy raises the following error:

$ mypy --config-file mypy.ini .
error: "AddressType" has no attribute "dict"
Found 1 error in 1 file (checked 50 source files)

Due to the fact that Patient.address is type-annotated as a List[AddressType] rather than List[Address]. As far as using the returned Address model, that seems to work great so far! The only issue is with type checking. For example, say I want to print the state the patient lives in:

print(patient.address[0].state)

mypy then complains with:

$ mypy --config-file mypy.ini .
error: "AddressType" has no attribute "state"
Found 1 error in 1 file (checked 50 source files)

This means I have to put casts in the code instead of offloading all typing guarantees to pydantic:

from typing import cast

address_pydantic = cast(patient.address[0])
print(address_pydantic.state)

(No mypy issues with the above)

My question (and again this may be naive as I'm not a pydantic expert at this point) is why you've chosen to separate the validators from the returned model? It seems to me that if the AddressType's validator function def __get_validators__(cls) -> "CallableGenerator": was implemented on the actual class Address(element.Element): we would be able to have type hints that are accurate to the field's type at runtime. Thus, removing the need to perform casts to satisfy mypy.

In the pydantic docs they show an example of __get_validators__(cls) being implemented on the actual type returned (similar to what I'm trying to describe above). Was there some limitation of that pattern that motivated the decision to have an AbstractType do validation instead of the returned type doing it's own validation?

I hope this clears up my question a bit, but if not no worries! Happy to clarify more if needed :)

nazrulworld commented 3 years ago

Thanks a lot for your clarification! 💯 The short answer to your question is that to avoid the python circular dependency import problem.

Honestly speaking I had also thoughts like you, but cannot find a good solution than the current one. FHIR resources relationships are a little bit complex and two-way (kind of circular) in some cases. One example I can give you https://github.com/nazrulworld/fhir.resources/blob/master/fhir/resources/element.py and https://github.com/nazrulworld/fhir.resources/blob/master/fhir/resources/extension.py You see that Extension class derived from Element class but Element class has field list Extension!

sambarnes commented 3 years ago

Ahh that makes total sense! I had a suspicion that the complex relationships might create circular dependencies.

Appreciate the insight you've been able to provide! We'll just continue to cast for now, and maybe investigate if a small mypy plugin can be written to automatically cast to the right type during typechecks

Closing :)

nazrulworld / fhir.resources

Question regarding typing and mypy #58

Description

Question