s-knibbs / dataclasses-jsonschema

JSON schema generation from dataclasses
MIT License

FieldEncoder `json_schema` property not used by validator #186

Open jisaacstone opened 2 years ago

jisaacstone commented 2 years ago

I have an Enum whose numeric values are used internally, while its member names (strings) are used in the API calls.

I solved this with a custom FieldEncoder, but from_dict fails when I call it with validate=True.

This seems to be because MyClass._field_encoders[field.type].json_schema differs from what MyClass._get_field_schema(field, schema_options) returns.

Example:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from typing import Union, Type
from dataclasses_jsonschema import JsonSchemaMixin, FieldEncoder
from dataclasses import dataclass
import enum

class EnumEncoder(FieldEncoder):
    def __init__(self, enum_: Type[enum.Enum]):
        self.enum_ = enum_

    def to_wire(self, val: enum.Enum) -> str:
        return val.name

    def to_python(self, in_: Union[int, str]):
        if isinstance(in_, int):
            return self.enum_(in_)
        return self.enum_[in_]

    @property
    def json_schema(self):
        # Advertise the member names (the wire format), not the numeric values.
        return {'type': 'string', 'enum': [member.name for member in self.enum_]}

class MyEnum(enum.Enum):
    CHAIR = 0
    TABLE = 1
    STOOL = 2

JsonSchemaMixin.register_field_encoders({
    MyEnum: EnumEncoder(MyEnum)
})

@dataclass
class MyClass(JsonSchemaMixin):
    pk: str
    category: MyEnum

testdata = {'pk': 'abc', 'category': 'CHAIR'}
MyClass.from_dict(testdata, validate=False) #  MyClass(pk='abc', category=<MyEnum.CHAIR: 0>)
MyClass.from_dict(testdata, validate=True) #  ValidationError: 'CHAIR' is not of type 'integer'

jsf = MyClass._get_fields()[-1]

MyClass._field_encoders[jsf.field.type].json_schema #  {'type': 'string', 'enum': ['CHAIR', 'TABLE', 'STOOL']}
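# schema_options is not defined in this snippet; it stands for the options object that _get_field_schema expects.
MyClass._get_field_schema(jsf.field, schema_options) # ({'type': 'integer', 'enum': [0, 1, 2]}, True)

dhagrow commented 1 year ago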

I am also trying to use enum names instead of values (it would be a more sensible default, imo). The problem is that enums are handled as a special case, and that case is checked at a different point for encoding, decoding, and schema generation. A FieldEncoder works for encoding and decoding because the field encoders are checked before the enum special case. For schemas, the enum case is checked first, so an encoder registered for an enum type is never reached.
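
To make the ordering issue concrete, here is a minimal standalone sketch. It does not use dataclasses_jsonschema at all: ENCODER_SCHEMAS, default_enum_schema, schema_enum_first, and schema_encoders_first are hypothetical names invented for illustration, and the default enum schema is simplified to a list of values. It only shows how the lookup order decides which schema wins.

```python
import enum
from typing import Any, Callable, Dict


class MyEnum(enum.Enum):
    CHAIR = 0
    TABLE = 1
    STOOL = 2


# Hypothetical registry: maps a type to a callable returning its custom schema.
ENCODER_SCHEMAS: Dict[type, Callable[[], Dict[str, Any]]] = {
    MyEnum: lambda: {"type": "string", "enum": [m.name for m in MyEnum]},
}


def default_enum_schema(enum_type: type) -> Dict[str, Any]:
    """Simplified stand-in for built-in enum handling: describe members by value."""
    return {"enum": [member.value for member in enum_type]}


def schema_enum_first(field_type: type) -> Dict[str, Any]:
    """Enum special case is checked first, so a registered encoder is never reached."""
    if issubclass(field_type, enum.Enum):
        return default_enum_schema(field_type)
    if field_type in ENCODER_SCHEMAS:
        return ENCODER_SCHEMAS[field_type]()
    raise TypeError(f"no schema for {field_type!r}")


def schema_encoders_first(field_type: type) -> Dict[str, Any]:
    """Registered encoders are checked first; the enum case is only a fallback."""
    if field_type in ENCODER_SCHEMAS:
        return ENCODER_SCHEMAS[field_type]()
    if issubclass(field_type, enum.Enum):
        return default_enum_schema(field_type)
    raise TypeError(f"no schema for {field_type!r}")


print(schema_enum_first(MyEnum))
print(schema_encoders_first(MyEnum))
```

With the enum check first, the schema lists the numeric values and the custom schema is ignored. With the encoder check first, the schema lists the member names, which matches what the encoder actually puts on the wire.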

I think all that needs to change for this to work is for _get_field_schema to check the field encoders first. You would still need to register an encoder for each enum type. It would be great if there were a way to register a FieldEncoder for a type and all of its subtypes, but then the field encoders would have to receive the subtype as an argument.
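
As a rough idea of what subtype-aware registration could look like, here is a standalone sketch. Nothing in it is part of dataclasses_jsonschema: EnumNameEncoder, BASE_ENCODER_FACTORIES, and find_encoder are hypothetical names, and EnumNameEncoder only mimics the to_wire / to_python / json_schema shape from the example earlier in this issue. The registry is keyed by a base type, and the lookup walks the field type's MRO and passes the concrete subtype to the factory.

```python
import enum
from typing import Any, Callable, Dict, Optional


class EnumNameEncoder:
    """Hypothetical encoder that is constructed with the concrete enum subtype."""

    def __init__(self, enum_type: type):
        self.enum_type = enum_type

    def to_wire(self, value: enum.Enum) -> str:
        return value.name

    def to_python(self, raw: str) -> enum.Enum:
        return self.enum_type[raw]

    @property
    def json_schema(self) -> Dict[str, Any]:
        return {"type": "string", "enum": [m.name for m in self.enum_type]}


# Hypothetical registry: base type -> factory that receives the concrete subtype.
BASE_ENCODER_FACTORIES: Dict[type, Callable[[type], Any]] = {
    enum.Enum: EnumNameEncoder,
}


def find_encoder(field_type: type) -> Optional[Any]:
    """Walk the MRO so that registering a base type also covers its subtypes."""
    for base in field_type.__mro__:
        factory = BASE_ENCODER_FACTORIES.get(base)
        if factory is not None:
            return factory(field_type)
    return None


class MyEnum(enum.Enum):
    CHAIR = 0
    TABLE = 1
    STOOL = 2


encoder = find_encoder(MyEnum)
print(encoder.to_wire(MyEnum.CHAIR))
print(encoder.to_python("TABLE"))
print(encoder.json_schema)
```

A single registration for enum.Enum then covers every enum class, at the cost of building an encoder per lookup; a real implementation would probably cache the encoder per subtype.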