Glad you are having fun :)
Yes. I am surprised that the validation logic is different when parsing json, and I consider the fact that this doesn't work a bug. Finishing something up for the night and will return tomorrow.
edit: it looks like there will have to be a change upstream in pydantic; I'll raise an issue with them tomorrow with more details.
Thanks for the quick response. Looking forward to reading the issue on pydantic. Hopefully there are other workarounds or a quick solution that can be merged into pydantic.
An easy hack which works for me is to add an `Annotated` type over `NDArray`:

```python
from typing import Annotated

import numpy as np
from numpydantic import NDArray as _NDArray
from pydantic import AfterValidator, BaseModel

# coerce the parsed value back to an ndarray after validation
NDArray = Annotated[_NDArray, AfterValidator(lambda x: np.array(x))]


class MyModel(BaseModel):
    array: NDArray


myobj = MyModel(array=[1.0, 2.0])
json_s = myobj.model_dump_json()
loaded_obj = MyModel.model_validate_json(json_s)
assert isinstance(loaded_obj.array, np.ndarray)
```
Ha, yes :) that should work, though we lose the ability to use the model with other array backends (shape and dtype validation should still work).
The basic problem is that when parsing json, pydantic-core just uses the json schema and not the python validators. The json schema is correct for an n-dimensional array in json (a list of lists, parametrised according to the shape and dtype constraints), so it validates, but we need a way to hook into the coercion parts of the array interfaces at the end of the json parsing. I'm going to look further into whether there's a way to chain a validator for just the json validation, and if not we might have to do some more monkeypatching.
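For illustration, chaining a coercion step onto just the json branch might look something like this with pydantic-core primitives (a rough sketch, not numpydantic's actual code; `CoercedArray` is a made-up name):

```python
import numpy as np
from pydantic import BaseModel
from pydantic_core import core_schema


class CoercedArray:
    """Hypothetical field type whose json branch chains an ndarray coercion."""

    @classmethod
    def __get_pydantic_core_schema__(cls, source_type, handler):
        return core_schema.json_or_python_schema(
            # json input: validate as a list of floats, then coerce to ndarray
            json_schema=core_schema.chain_schema(
                [
                    core_schema.list_schema(core_schema.float_schema()),
                    core_schema.no_info_plain_validator_function(np.array),
                ]
            ),
            # python input: coerce whatever we are given
            python_schema=core_schema.no_info_plain_validator_function(np.array),
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda a: np.asarray(a).tolist()
            ),
        )


class Model(BaseModel):
    array: CoercedArray


# the json path now ends in the coercion step, so revalidation roundtrips
m = Model.model_validate_json('{"array": [1.0, 2.0]}')
assert isinstance(m.array, np.ndarray)
```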
Makes sense! I will be happy to test out your changes. Let me know if I can help in any way.
Just letting you know I've figured this out and will issue a patch tonight or tomorrow <3. Simpler than I thought: we just need to change the way we're generating the json schema on the `NDArray` class (which we will soon rewrite anyway to make a proper generic, but that's another issue).
edit: for more info - I had misunderstood how `json_or_python_schema` worked. Since `__get_pydantic_core_schema__` receives the `_source_type` but `__get_pydantic_json_schema__` doesn't, we generated the json schema there, because that's when we have the `shape` and `dtype` values. But `json_or_python_schema` is what makes pydantic use the json schema when revalidating the json. If instead we just generate the json schema in `__get_pydantic_json_schema__`, pydantic uses the python validator, which correctly roundtrips.
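In outline, the fix described above might look like this (a hedged sketch; the real `NDArray` class parametrises the json schema with `shape` and `dtype`, which this toy version omits):

```python
import numpy as np
from pydantic_core import core_schema


class NDArrayLike:
    """Toy stand-in for NDArray, showing the shape of the fix."""

    @classmethod
    def __get_pydantic_core_schema__(cls, source_type, handler):
        # no json_or_python_schema here: the python validator now runs for
        # both python and json input, so parsed json gets coerced to ndarray
        return core_schema.no_info_plain_validator_function(
            np.array,
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda a: np.asarray(a).tolist()
            ),
        )

    @classmethod
    def __get_pydantic_json_schema__(cls, schema, handler):
        # generate the json schema here instead; the real class would emit a
        # list-of-lists schema constrained by shape and dtype
        return {"type": "array", "items": {"type": "number"}}
```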
@sneakers-the-rat thanks for the explanation. After your explanation, I kinda understand `__get_pydantic_core_schema__` and `__get_pydantic_json_schema__`. It would be great to have it handled by your lib. I am wondering if there is a planned release cycle for numpydantic.
No planned release cycle; I just fix bugs as they come up and make enhancements as requested at this point, but I am using semver and will give appropriate deprecation warnings in the case of breaking changes. The next planned major version, 2.0.0, will replace the basic `NDArray` type with a proper Generic using `TypeVarTuple` while keeping current behavior, and will move away from nptyping, with full removal in 3.0.0, so plenty of warning.
I'll make this patch shortly
Making a note with a checklist item to
for ur consideration: https://github.com/p2p-ld/numpydantic/pull/20 - docs: https://numpydantic.readthedocs.io/en/dump_json/serialization.html
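With that patch applied, the roundtrip from the earlier example should work without the `Annotated` hack (a sketch based on this thread rather than the linked docs):

```python
import numpy as np
from numpydantic import NDArray
from pydantic import BaseModel


class MyModel(BaseModel):
    array: NDArray


# revalidating the dumped json now goes through the python validator,
# so the field comes back as an ndarray
obj = MyModel(array=np.array([1.0, 2.0]))
loaded = MyModel.model_validate_json(obj.model_dump_json())
assert isinstance(loaded.array, np.ndarray)
```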
Thanks for creating this essential piece missing from pydantic. I want to serialize a bunch of classes containing ndarrays and deserialize them such that, after deserialization, the class has elements of type ndarray. However, according to the examples, that is not the supported behavior.
I am wondering if there is a way to get this behavior using numpydantic.