AllenNeuralDynamics / aind-data-schema

A library that defines AIND data schema and validates JSON.
MIT License
19 stars 15 forks source link

Upper Bound for pydantic #912

Closed mekhlakapoor closed 4 months ago

mekhlakapoor commented 4 months ago

Pydantic 2.7 has a couple of big changes that we'll need to handle. There's an issue to do this upgrade, but for now pin the version since it looks like this is affecting the metadata-entry

bruno-f-cruz commented 4 months ago

Any chance you can describe the bugs you are finding? Pinning the version to <2.7 will make it impossible for me to use aind-data-schemas as 2.7 fixed important bugs with json-model generation (some of which aind-data-schema benefits from too, see #797)

mekhlakapoor commented 4 months ago

It's changing the way a couple fields are rendering, but we can discuss.

bruno-f-cruz commented 4 months ago

If the stability of rendered fields in your application (which I assume uses the json-schema as the source) is critical (as it was in my case), I would suggest decoupling it from the pydantic default json model generator by creating your own class. You can see an example here: https://github.com/AllenNeuralDynamics/Aind.Behavior.Services/blob/71a036627c6a7feb192d871eb316d87234edbd2c/src/DataSchemas/aind_behavior_services/utils.py#L16

You can then use this as an intermediate step between pydantic and your metadata-entry application. Happy to discuss next week if you think this would help!

mekhlakapoor commented 4 months ago

Maybe as a workaround, pin pydantic version to 2.6 in github action. @jtyoung84 will look into it

jtyoung84 commented 4 months ago

@bruno-f-cruz @saskiad @dyf This version is breaking more things than just the rendered json. I'd rather add an upper bound now and figure out what was changed in version 2.7. We're already having to add the upper bound in other repos that import aind-data-schema.

bruno-f-cruz commented 4 months ago

Ok, I will keep my custom generators for now to make things work on my end. If you have targeted bugs/issues that you found I am happy to take a look to make a push to have 2.7.

jtyoung84 commented 4 months ago

There are a few small things on how we are handling invalid models. I'll have to sit down and sort out what needs to be fixed. As an example, in one of our libraries that we're using to upgrade from old models to newer models, this new behavior broke a few unit tests:

pydantic v2.6

from aind_data_schema.core.data_description import DataDescription
foobar = DataDescription.model_construct(**{"creation_date": "2020-10-10"})
hasattr(foobar, "creation_date")
# True

pydantic 2.7

from aind_data_schema.core.data_description import DataDescription
foobar = DataDescription.model_construct(**{"creation_date": "2020-10-10"})
hasattr(foobar, "creation_date")
# False
bruno-f-cruz commented 4 months ago

Not sure if it helps, but this has to do with the behavior of regarding the deserialization of extra properties. I wonder if you can transiently change the Config attribute to allow your bypass.


from aind_data_schema.core.data_description import DataDescription
from pydantic import BaseModel, ConfigDict

class Allowed(BaseModel):
    model_config = ConfigDict(extra='allow')
    foo: str

class Forbidden(BaseModel):
    model_config = ConfigDict(extra='forbid')
    bar: str

class Ignored(BaseModel):
    model_config = ConfigDict(extra='ignore')
    baz: str

allowed = Allowed.model_construct(**{"creation_date": "2020-10-10"})
forbidden = Forbidden.model_construct(**{"creation_date": "2020-10-10"})
ignored = Ignored.model_construct(**{"creation_date": "2020-10-10"})

print(hasattr(allowed, "creation_date"))
#True
print(hasattr(forbidden, "creation_date"))
#False
print(hasattr(ignored, "creation_date"))
#False

In case it helps here's the original discussion (https://github.com/pydantic/pydantic/issues/8266).

jtyoung84 commented 4 months ago

I think it will most likely require us to update how we are handling the "model_construct" field in a few places. I'd also like to roll out whatever changes we need to our aind-metadata-entry-js app before removing the upper bound.