Closed mekhlakapoor closed 4 months ago
Any chance you can describe the bugs you are finding? Pinning the version to <2.7 will make it impossible for me to use aind-data-schemas as 2.7 fixed important bugs with json-model generation (some of which aind-data-schema benefits from too, see #797)
It's changing the way a couple fields are rendering, but we can discuss.
If the stability of rendered fields in your application (which I assume uses the json-schema as the source) is critical (as it was in my case), I would suggest decoupling it from the pydantic default json model generator by creating your own class. You can see an example here: https://github.com/AllenNeuralDynamics/Aind.Behavior.Services/blob/71a036627c6a7feb192d871eb316d87234edbd2c/src/DataSchemas/aind_behavior_services/utils.py#L16
You can then use this as an intermediate step between pydantic and your metadata-entry application. Happy to discuss next week if you think this would help!
Maybe as a workaround, pin pydantic version to 2.6 in github action. @jtyoung84 will look into it
@bruno-f-cruz @saskiad @dyf This version is breaking more things than just the rendered json. I'd rather add an upper bound now and figure out what was changed in version 2.7. We're already having to add the upper bound in other repos that import aind-data-schema.
Ok, I will keep my custom generators for now to make things work on my end. If you have targeted bugs/issues that you found I am happy to take a look to make a push to have 2.7.
There are a few small things on how we are handling invalid models. I'll have to sit down and sort out what needs to be fixed. As an example, in one of our libraries that we're using to upgrade from old models to newer models, this new behavior broke a few unit tests:
pydantic v2.6
from aind_data_schema.core.data_description import DataDescription
foobar = DataDescription.model_construct(**{"creation_date": "2020-10-10"})
hasattr(foobar, "creation_date")
# True
pydantic 2.7
from aind_data_schema.core.data_description import DataDescription
foobar = DataDescription.model_construct(**{"creation_date": "2020-10-10"})
hasattr(foobar, "creation_date")
# False
Not sure if it helps, but this has to do with the behavior of regarding the deserialization of extra properties. I wonder if you can transiently change the Config attribute to allow your bypass.
from aind_data_schema.core.data_description import DataDescription
from pydantic import BaseModel, ConfigDict
class Allowed(BaseModel):
model_config = ConfigDict(extra='allow')
foo: str
class Forbidden(BaseModel):
model_config = ConfigDict(extra='forbid')
bar: str
class Ignored(BaseModel):
model_config = ConfigDict(extra='ignore')
baz: str
allowed = Allowed.model_construct(**{"creation_date": "2020-10-10"})
forbidden = Forbidden.model_construct(**{"creation_date": "2020-10-10"})
ignored = Ignored.model_construct(**{"creation_date": "2020-10-10"})
print(hasattr(allowed, "creation_date"))
#True
print(hasattr(forbidden, "creation_date"))
#False
print(hasattr(ignored, "creation_date"))
#False
In case it helps here's the original discussion (https://github.com/pydantic/pydantic/issues/8266).
I think it will most likely require us to update how we are handling the "model_construct" field in a few places. I'd also like to roll out whatever changes we need to our aind-metadata-entry-js app before removing the upper bound.
Pydantic 2.7 has a couple of big changes that we'll need to handle. There's an issue to do this upgrade, but for now pin the version since it looks like this is affecting the metadata-entry