Indicio-tech / pydid

Python library for validating, constructing, and representing DIDs and DID Documents
Apache License 2.0
11 stars 12 forks source link

:arrow_up: Upgrade pydantic to v2 #72

Closed ff137 closed 5 months ago

ff137 commented 1 year ago

:construction: work in progress - converted all the basic, deprecated methods, with the help of bump-pydantic

Currently struggling to get deserialization to work. Will update with progress soon

ff137 commented 5 months ago

@dbluhm I think I have done it 👀

Just as I was nearing the end of my rope, with 50 tests failing because DIDDocuments couldn't serialize ... when all seemed lost, I realised that for the new pydantic model_validators, the "before" and "after" setting has a big impact on the types being handled.

I'll spare most of the detail, but basically you don't want the "after" validators (@model_validator(mode="after")) to return dictionaries, but it's okay if the "before" validators do, because pydantic's internal validator will convert the dict to the model. So in the "before" validators, handling for dictionary objects as follows makes sense:

        if not isinstance(values, dict):
            values = values.__dict__

(because some of the existing validator logic wants to iterate over values.items()) But doing so does not make sense for the "after" validators, because they have already been internally validated and will always be a model type. So my after validators were return dictionaries, messing things up.

That was the breakthrough that solved all the DIDDocuments not being able to serialize, and then there were just a handful of test failures that could more easily be traced down. That initial breakthrough did take a lot of persistence though, with many hours of head scratching.

There were quite a few places where I just guessed what's the best solution -- so please do review thoroughly! For example, the VerificationMethod has a set of material_properties, and there's a validator which asserts that no more than one material property is provided. This failed when the input dictionary was using the camelCase alias for the field names. So, I just added a camel case version of the set of material properties, so the validator now checks for both. Not sure if there's a better way, to utilise pydantic to handle camel case vs snake case, but hey, it seems to work.

~Also, one seemingly big change is that I couldn't get dereference_as to work with KnownVerificationMethods. Because KnownVerificationMethods is essentially a long union list of accepted types. I have no clue how to get pydantic to validate an arbitrary input to the correct type, using the new type adapters. There doesn't seem to be a strict requirement for that in the code itself, there's just one test which tried that. So, I just changed the test to dereference to the expected type.. But it may be a breaking change if any libraries use Resource(..).dereference_as(KnownVerificationMethods).~ Edit: I was mistaken. Deserialising to KnownVerificationMethods still works 😄

ff137 commented 5 months ago

As luck would have it, the 1.74% of uncovered code seems to contain a bug! Investigating

ff137 commented 5 months ago

Well, I'm not quite sure why, but Pydantic is complaining about a v1 method being used:

  File "/home/aries/.local/lib/python3.9/site-packages/pydid/service.py", line 16, in <module>
    class Service(Resource):
  File "/usr/local/lib/python3.9/site-packages/pydantic/_internal/_model_construction.py", line 202, in __new__
    complete_model_class(
...
  File "/usr/local/lib/python3.9/site-packages/pydantic/_internal/_generate_schema.py", line 1804, in inner_handler
    metadata_js_function = _extract_get_pydantic_json_schema(obj, schema)
  File "/usr/local/lib/python3.9/site-packages/pydantic/_internal/_generate_schema.py", line 2157, in _extract_get_pydantic_json_schema
    raise PydanticUserError(
pydantic.errors.PydanticUserError: The `__modify_schema__` method is not supported in Pydantic v2. Use `__get_pydantic_json_schema__` instead in class `DIDUrl`.

The error appears to be misleading, because we do not have __modify_schema__ being referenced anywhere. The pydantic code that raises the exception is doing a check like this:

def _extract_get_pydantic_json_schema(tp: Any, schema: CoreSchema) -> GetJsonSchemaFunction | None:
    """Extract `__get_pydantic_json_schema__` from a type, handling the deprecated `__modify_schema__`."""
    js_modify_function = getattr(tp, '__get_pydantic_json_schema__', None)

    if hasattr(tp, '__modify_schema__'):
        from pydantic import BaseModel  # circular reference

        has_custom_v2_modify_js_func = (
            js_modify_function is not None
            and BaseModel.__get_pydantic_json_schema__.__func__  # type: ignore
            not in (js_modify_function, getattr(js_modify_function, '__func__', None))
        )

        if not has_custom_v2_modify_js_func:
            cls_name = getattr(tp, '__name__', None)
            raise PydanticUserError(
                f'The `__modify_schema__` method is not supported in Pydantic v2. '
                f'Use `__get_pydantic_json_schema__` instead{f" in class `{cls_name}`" if cls_name else ""}.',
                code='custom-json-schema',
            )

Well, I can't make much sense of it this late at night, but it appears I've not implemented __get_pydantic_json_schema__ correctly! Will try again in due time

ff137 commented 5 months ago

The above error was indeed misleading. Very bizarre, but it appears to be that pydid and pydantic versions got mixed up in dependency tree for where I was testing. Everything does appear to be in order. There's just some breaking changes to be aware of -- the validate classmethod has been renamed to model_validate so that it is consistent with the new pydantic v2 naming convention. But this means other libraries will need to be refactored to call model_validate instead. So, I'm just gonna add the validate method as it was.

ff137 commented 5 months ago

@dbluhm good stuff! All checks have passed in https://github.com/hyperledger/aries-cloudagent-python/pull/2919

dbluhm commented 5 months ago

Some minor conflicts to poetry lock after I merged in dependabot PRs

ff137 commented 5 months ago

lock file updated :+1:

ff137 commented 4 months ago

@dbluhm the following lines: PossibleServiceTypes = Union[DIDCommV1Service, DIDCommV2Service, UnknownService] service: Optional[List[PossibleServiceTypes]] = None over here is causing the following warning to pop up in an ACA-Py run:

UserWarning: Pydantic serializer warnings:
   Expected `Union[DIDCommV1Service, DIDCommV2Service, UnknownService]` but got `Service` - serialized value may not be as expected

Is there a new Service type that needs to be added to that PossibleServiceTypes list?

dbluhm commented 4 months ago

Good question. I think the behavior in 1.x of pydantic was happy to consider Service to be an UnknownService. It seems 2.x is pickier. We might be able to get the same functionality by just using Service instead of UnknownService in that union but I say that without looking too deeply.