pydantic / pydantic

Data validation using Python type hints
https://docs.pydantic.dev
MIT License

NotImplementedError: Cannot check isinstance when validating from json, use a JsonOrPython validator instead. #8890

Closed rbavery closed 1 month ago

rbavery commented 8 months ago

Description

I'm attempting to save an instance of my Pydantic model to JSON and validate it back, but I'm encountering an error with no useful information in the traceback.

https://github.com/crim-ca/dlm-extension/blob/41cf8eaa768f41e988ddd693688fc9d795d58cf6/tests/test_schema.py#L11

the following fails with

In [1]: from stac_model.examples import eurosat_resnet
   ...: from stac_model.schema import MLModel
   ...: model_metadata = eurosat_resnet()
   ...: metadata_json = model_metadata.model_dump_json(indent=2)
   ...: model_metadata_validated = MLModel.model_validate_json(metadata_json)
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[1], line 5
      3 model_metadata = eurosat_resnet()
      4 metadata_json = model_metadata.model_dump_json(indent=2)
----> 5 model_metadata_validated = MLModel.model_validate_json(metadata_json)

File ~/miniforge3/lib/python3.10/site-packages/pydantic/main.py:538, in BaseModel.model_validate_json(cls, json_data, strict, context)
    536 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
    537 __tracebackhide__ = True
--> 538 return cls.__pydantic_validator__.validate_json(json_data, strict=strict, context=context)

this error is likely related to https://github.com/pydantic/pydantic/issues/8455 but I couldn't fix this by following this comment's guidance https://github.com/pydantic/pydantic/issues/8455#issuecomment-1873460962

I tried removing the only isinstance check I could find here and it didn't have any effect on the error.

Example Code

the package I'm working with can be installed with

pip install stac-model==0.1.1.alpha3

then the error can be reproduced with

from stac_model.examples import eurosat_resnet
from stac_model.schema import MLModel
model_metadata = eurosat_resnet()
metadata_json = model_metadata.model_dump_json(indent=2)
model_metadata_validated = MLModel.model_validate_json(metadata_json)

the model is defined here: https://github.com/crim-ca/dlm-extension/tree/21d1aa9947befc8fb2a46458ef84e9d183d50f48/stac_model

Python, Pydantic & OS Version

→ poetry run python -c "import pydantic.version; print(pydantic.version.version_info())"      
             pydantic version: 2.6.2
        pydantic-core version: 2.16.3
          pydantic-core build: profile=release pgo=true
                 install path: /home/rave/.cache/pypoetry/virtualenvs/stac-model-YQXQDXJF-py3.10/lib/python3.10/site-packages/pydantic
               python version: 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
                     platform: Linux-6.5.0-21-generic-x86_64-with-glibc2.35
             related packages: typing_extensions-4.10.0 mypy-1.0.1
                       commit: unknown

rbavery commented 8 months ago

Downgrading to the versions below provided a more informative traceback, which showed that I was missing Optional type annotations on some fields.

  • Downgrading pydantic-core (2.16.3 -> 2.6.3)
  • Downgrading pydantic (2.6.2 -> 2.3.0)

Once I addressed these issues, I could run model_validate_json, but only with pydantic 2.3. With the Optional typing issues fixed, I still get the same error when I switch back to version 2.6.3.
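For readers hitting the same thing, here is a minimal sketch of how a missing Optional annotation can break a JSON round trip. The model names Loose and Fixed are hypothetical and are not taken from stac-model. The sketch assumes pydantic v2, where a None default is not validated against the annotation unless validate_default is enabled.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Loose(BaseModel):
    # Annotated as str but defaulting to None. The default is not validated,
    # so constructing the model succeeds (serializing may emit a warning).
    ml_model_type: str = None


class Fixed(BaseModel):
    # Optional[str] tells pydantic that None is an accepted value.
    ml_model_type: Optional[str] = None


loose_json = Loose().model_dump_json()
fixed_json = Fixed().model_dump_json()

try:
    Loose.model_validate_json(loose_json)
    loose_round_trips = True
except ValidationError:
    # The dumped null is rejected because the field is declared as str.
    loose_round_trips = False

fixed_round_trips = Fixed.model_validate_json(fixed_json).ml_model_type is None

print("Loose round-trips:", loose_round_trips)
print("Fixed round-trips:", fixed_round_trips)
```

This only illustrates the Optional part of the problem. It does not reproduce the NotImplementedError itself.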

sydney-runkle commented 8 months ago

@rbavery,

Could you provide an MRE for this issue? I'm not able to view the commit that you linked with the model. A boiled-down version of this issue will help us to address your problem! Thanks.

rbavery commented 8 months ago

Making an MRE would be too large a project for me this week. I'm not sure how long it would take to reproduce the issue without the library I'm using, since I don't know what causes it in the first place (there's no information in the traceback about where the issue occurred).

I just noticed the formatting was off in my original post, sorry about that.

My package source is here: https://github.com/rbavery/dlm-extension/tree/validate/stac_model

and it is pip installable and testable with the code provided above

the package I'm working with can be installed with

pip install stac-model==0.1.1.alpha3

then the error can be reproduced with

from stac_model.examples import eurosat_resnet
from stac_model.schema import MLModel
model_metadata = eurosat_resnet()
metadata_json = model_metadata.model_dump_json(indent=2)
model_metadata_validated = MLModel.model_validate_json(metadata_json)

sydney-runkle commented 8 months ago

@rbavery,

If we don't have a purely pydantic MRE, I don't think it makes sense to have this issue open here. I'd recommend opening an issue on the stac_model repo!

Thanks!

rbavery commented 8 months ago

@sydney-runkle I'm the maintainer of stac-model, and the primary dependency is pydantic.

https://github.com/rbavery/dlm-extension/blob/validate/pyproject.toml#L58

rbavery commented 8 months ago

@sydney-runkle it actually looks like I provided a minimal repro back in December, sorry for forgetting to raise this in a new issue: https://github.com/pydantic/pydantic/issues/8189#issuecomment-1848226486

I just tested this with pydantic 2.3.2 and the issue still occurs

# rave at rave-desktop in ~/work/dlm-extension on git:validate ✖︎ [11:21:33]
→ pip install pydantic=="2.6.3"       
Collecting pydantic==2.6.3
  Obtaining dependency information for pydantic==2.6.3 from https://files.pythonhosted.org/packages/ac/86/c98520827f58c8753783be4bf2286b4f73a18ac71c93ab597ae1aeb26fc8/pydantic-2.6.3-py3-none-any.whl.metadata
  Downloading pydantic-2.6.3-py3-none-any.whl.metadata (84 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 84.4/84.4 kB 1.4 MB/s eta 0:00:00
Requirement already satisfied: annotated-types>=0.4.0 in /home/rave/miniforge3/lib/python3.10/site-packages (from pydantic==2.6.3) (0.6.0)
Requirement already satisfied: pydantic-core==2.16.3 in /home/rave/miniforge3/lib/python3.10/site-packages (from pydantic==2.6.3) (2.16.3)
Requirement already satisfied: typing-extensions>=4.6.1 in /home/rave/.local/lib/python3.10/site-packages (from pydantic==2.6.3) (4.8.0)
Downloading pydantic-2.6.3-py3-none-any.whl (395 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 395.2/395.2 kB 3.1 MB/s eta 0:00:00
Installing collected packages: pydantic
  Attempting uninstall: pydantic
    Found existing installation: pydantic 2.3.0
    Uninstalling pydantic-2.3.0:
      Successfully uninstalled pydantic-2.3.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fastapi 0.104.1 requires anyio<4.0.0,>=3.7.1, but you have anyio 4.2.0 which is incompatible.
lightly 1.4.23 requires pydantic<2,>=1.10.5, but you have pydantic 2.6.3 which is incompatible.
Successfully installed pydantic-2.6.2
(base) 
# rave at rave-desktop in ~/work/dlm-extension on git:validate ✖︎ [11:24:18]
→ ipython                      
Python 3.10.12 | packaged by conda-forge | (main, Jun 23 2023, 22:40:32) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.18.1 -- An enhanced Interactive Python. Type '?' for help.

   ...:     def validate_key(cls, v):
   ...:         if '//' in v:
   ...:             raise ValueError('Key must not contain double slashes')
   ...:         return v.strip('/')
   ...: 
   ...: class ModelArtifact(BaseModel):
   ...:     path: S3Path | FilePath | str = Field(...)
   ...:     additional_files: Optional[Dict[str, FilePath]] = None
   ...: 
   ...:     class Config:
   ...:         arbitrary_types_allowed = True
   ...: 
   ...: class ClassMap(BaseModel):
   ...:     class_to_label_id: Dict[str, int]
   ...: 
   ...:     @property
   ...:     def label_id_to_class(self) -> Dict[int, str]:
   ...:         return {v: k for k, v in self.class_to_label_id.items()}
   ...: 
   ...: class ModelMetadata(BaseModel):
   ...:     signatures: ModelSignature
   ...:     artifact: ModelArtifact
   ...:     id: str = Field(default_factory=lambda: uuid4().hex)
   ...:     class_map: ClassMap
   ...:     runtime_config: Optional[RuntimeConfig] = None
   ...:     name: str
   ...:     ml_model_type: Optional[str] = None
   ...:     ml_model_processor_type: Optional[Literal["cpu", "gpu", "tpu", "mps"]] = None
   ...:     ml_model_learning_approach: Optional[str] = None
   ...:     ml_model_prediction_type: Optional[Literal["object-detection", "classification", "segmentation", "regression"]] = None
   ...:     ml_model_architecture: Optional[str] = None
   ...: 
   ...:     class Config:
   ...:         arbitrary_types_allowed = True
   ...: 
   ...: # Functions to create, serialize, and deserialize ModelMetadata
   ...: def create_metadata():
   ...:     input_sig = TensorSignature(name='input_tensor', dtype='float32', shape=(-1, 13, 64, 64))
   ...:     output_sig = TensorSignature(name='output_tensor', dtype='float32', shape=(-1, 10))
   ...:     model_sig = ModelSignature(inputs=[input_sig], outputs=[output_sig])
   ...:     model_artifact = ModelArtifact(path="s3://example/s3/uri/model.pt")
   ...:     class_map = ClassMap(class_to_label_id={
   ...:         'Annual Crop': 0, 'Forest': 1, 'Herbaceous Vegetation': 2, 'Highway': 3,
   ...:         'Industrial Buildings': 4, 'Pasture': 5, 'Permanent Crop': 6,
   ...:         'Residential Buildings': 7, 'River': 8, 'SeaLake': 9
   ...:     })
   ...:     return ModelMetadata(name="eurosat", class_map=class_map, signatures=model_sig, artifact=model_artifact, ml_model_processor_type="cpu")
   ...: 
   ...: def metadata_json(metadata: ModelMetadata) -> str:
   ...:     return metadata.model_dump_json(indent=2)
   ...: 
   ...: def model_metadata_json_operations(json_str: str) -> ModelMetadata:
   ...:     return ModelMetadata.model_validate_json(json_str)
   ...: 
   ...: # Running the functions end-to-end
   ...: metadata = create_metadata()
   ...: json_str = metadata_json(metadata)
   ...: model_metadata = model_metadata_json_operations(json_str)
   ...: 
   ...: print("Model Metadata Name:", model_metadata.name)
---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[1], line 100
     98 metadata = create_metadata()
     99 json_str = metadata_json(metadata)
--> 100 model_metadata = model_metadata_json_operations(json_str)
    102 print("Model Metadata Name:", model_metadata.name)

Cell In[1], line 95, in model_metadata_json_operations(json_str)
     94 def model_metadata_json_operations(json_str: str) -> ModelMetadata:
---> 95     return ModelMetadata.model_validate_json(json_str)

File ~/miniforge3/lib/python3.10/site-packages/pydantic/main.py:538, in BaseModel.model_validate_json(cls, json_data, strict, context)
    536 # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
    537 __tracebackhide__ = True
--> 538 return cls.__pydantic_validator__.validate_json(json_data, strict=strict, context=context)

NotImplementedError: Cannot check isinstance when validating from json, use a JsonOrPython validator instead.
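
The error message points at a JsonOrPython validator. Below is a minimal sketch of what that can look like, assuming the custom S3Path type is the field that triggers the isinstance check (the thread does not confirm this). The S3Path class here is a hypothetical stand-in that wraps a URI string, not the class from the reproduction above. core_schema.json_or_python_schema is the pydantic-core schema that applies one validator to JSON input and another to Python input.

```python
from typing import Any

from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import core_schema


class S3Path:
    """Hypothetical stand-in for a custom path type; it only stores a URI."""

    def __init__(self, uri: str) -> None:
        self.uri = uri

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Any, handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        # Validate a string, then build an S3Path from it.
        from_str = core_schema.chain_schema(
            [
                core_schema.str_schema(),
                core_schema.no_info_plain_validator_function(cls),
            ]
        )
        return core_schema.json_or_python_schema(
            # JSON input can only be a string, so no isinstance check is used.
            json_schema=from_str,
            # Python input may be an existing S3Path instance or a string.
            python_schema=core_schema.union_schema(
                [core_schema.is_instance_schema(cls), from_str]
            ),
            # Serialize back to the plain URI string.
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda value: value.uri
            ),
        )


class ModelArtifact(BaseModel):
    path: S3Path


artifact = ModelArtifact(path=S3Path("s3://example/s3/uri/model.pt"))
restored = ModelArtifact.model_validate_json(artifact.model_dump_json())
print(type(restored.path).__name__, restored.path.uri)
```

Because the type supplies its own schema, this sketch does not need arbitrary_types_allowed.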

sydney-runkle commented 5 months ago

Hi @rbavery,

Apologies for not following up earlier here - feel free to let me know if you're still experiencing this issue in our newest version, and I'd be happy to help!

UlfurOrn commented 1 month ago

Hey 👋 I ran into the same error message while using lambda powertools. E.g.

NotImplementedError: Cannot check issubclass when validating from json, use a JsonOrPython validator instead.

In this case it seems to be caused when a model with a field using the type type[BaseModel] is parsed from JSON.

Here's a minimal example:

import json
from typing import Any

from pydantic import BaseModel

class MyWorkingModel(BaseModel):
    string: str
    submodel: dict[str, Any] | BaseModel

class MyFailingModel(BaseModel):
    string: str
    submodel: dict[str, Any] | BaseModel | type[BaseModel]

data = {
    "string": "string",
    "submodel": {"key": "value"},
}

MyWorkingModel.model_validate_json(json.dumps(data))  # works
MyFailingModel.model_validate_json(json.dumps(data))  # fails 

I've already filed an issue in the Lambda Powertools repo here.

But I wanted to mention this here in case someone else is running into a similar issue (or if this is something that Pydantic wants to improve?).
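
Until that is resolved, one workaround that may help is to parse the JSON with the standard library and validate the resulting dict in Python mode, where isinstance and issubclass checks are supported. A minimal sketch, reusing the failing model from the example above (written with typing.Union so it also runs on Python 3.9):

```python
import json
from typing import Any, Union

from pydantic import BaseModel


class MyFailingModel(BaseModel):
    string: str
    submodel: Union[dict[str, Any], BaseModel, type[BaseModel]]


raw = json.dumps({"string": "string", "submodel": {"key": "value"}})

# Decode the JSON first, then validate the resulting Python objects.
# This avoids JSON-mode validation, which is where the error is raised.
parsed = MyFailingModel.model_validate(json.loads(raw))
print(parsed.string)
```

The trade-off is that the input is validated with Python-mode rules rather than JSON-mode rules, so strictness and coercion can differ slightly for some types.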

sydney-runkle commented 1 month ago

Hi,

Thanks for the reproducible example! This is on us - will try to fix here soon.