palantir / conjure-python-client

Python client and JSON encoders for use with generated Conjure clients
Apache License 2.0
7 stars 20 forks source link

Feature Request: Allow deserialization of hyphenated fields #132

Open Rkuro opened 1 year ago

Rkuro commented 1 year ago

What happened?

Describe what you've observed and why it's bad (please include whatever stacktraces/version numbers you can). Clear steps to reproduce the behaviour are always helpful too!

A very common use case for this type of capability is the ability to easily serialize/deserialize typed objects from a shared interface, but we need the ability to be a bit more flexible when deserializing objects into Conjure classes or we're going to need to use a different tool to share interfaces (which would be a bummer since I think its the right tool here).

Example:

Input object (python dictionary): { "foo-bar": "baz" }

Desired Conjure Object:

Banana(foo_bar: baz)

Currently - this will result in an empty Banana(FooBar: None) object which can potentially be misleading.

What did you want to happen?

I realize that having strict standards is generally a good thing, especially in an area with mostly solved problems (like IR object representation), so I am not proposing any changes in the default behavior.

Instead, I am arguing there should be a separate, non-standard set of "DecodingOptions" that should be available to the user upon request - with, for example, an option to loosen restrictions around deserialization and allow fields with hyphens to be able to deserialize into objects.

An implementation of what this might look like: PR- https://github.com/palantir/conjure-python-client/pull/131

Rkuro commented 1 year ago

Reason this doesn't work today is due to the rules defined in the test-cases yml file: https://github.com/palantir/conjure-verification/blob/85ec2c15cde616415e79d6b3b03c34559e287ab0/master-test-cases.yml#L390

  - type: KebabCaseObjectExample
    positive:
        - '{"kebab-cased-field":1}'
    negative:
        - '{"kebabCasedField":1}'
        - '{"kebab_cased_field":1}'

For example:

Given a generated conjure bean like so:

class CreateDatasetRequest(ConjureBeanType):
    @classmethod
    def _fields(cls) -> Dict[str, ConjureFieldDefinition]:
        return {
            "file_system_id": ConjureFieldDefinition("fileSystemId", str),
            "path": ConjureFieldDefinition("path", str),
        }

    _file_system_id: str = None
    _path: str = None

    def __init__(self, file_system_id: str, path: str) -> None:
        self._file_system_id = file_system_id
        self._path = path

    @property
    def file_system_id(self) -> str:
        return self._file_system_id

    @property
    def path(self) -> str:
        return self._path

Called like this:

def test_object_with_hyphen():
    decoded = ConjureDecoder().decode({"file-system-id": "foo", "path": "bar"}, CreateDatasetRequest)
    assert decoded == CreateDatasetRequest("foo", "bar")

And this implementation:

@classmethod
    def decode_conjure_bean_type(cls, obj, conjure_type):
        """Decodes json into a conjure bean type (a plain bean, not enum
        or union).
        Args:
            obj: the json object to decode <----------------------------------- #  {"file-system-id": "foo", "path": "bar"}
            conjure_type: a class object which is the bean type
                we're decoding into
        Returns:
            A instance of a bean of type conjure_type.
        """
        deserialized: Dict[str, Any] = {}

        for (
            python_arg_name,    <----------------------------------- #  ends up being "'file_system_id'"
            field_definition,
        ) in conjure_type._fields().items():
            field_identifier = field_definition.identifier  <----------------------------------- # ends up being "'fileSystemId'

             # only checking the field_identifier means incorrectly flagging the valid hyphenated case here
            if field_identifier not in obj or obj[field_identifier] is None:
                # Logic passes here
                cls.check_null_field(
                    obj, deserialized, python_arg_name, field_definition
                )
            else:
                value = obj[field_identifier]
                field_type = field_definition.field_type
                deserialized[python_arg_name] = cls.do_decode(
                    value, field_type
                )
        return conjure_type(**deserialized)

which passes through to

@classmethod
    def check_null_field(
        cls, obj, deserialized, python_arg_name, field_definition
    ):
        type_origin = get_origin(field_definition.field_type)
        if isinstance(field_definition.field_type, ListType):
            deserialized[python_arg_name] = []
        elif isinstance(field_definition.field_type, DictType):
            deserialized[python_arg_name] = {}
        elif isinstance(field_definition.field_type, OptionalType):
            deserialized[python_arg_name] = None
        elif type_origin is OptionalTypeWrapper:
            deserialized[python_arg_name] = None
        elif type_origin is list:
            deserialized[python_arg_name] = []
        elif type_origin is dict:
            deserialized[python_arg_name] = {}
        else:
            raise Exception( <-----------------------------------  this is where we are throwing the exception
                "field {} not found in object {}".format(
                    field_definition.identifier, obj
                )
            )

This fails with the above exception:

Exception: field fileSystemId not found in object {'file-system-id': 'foo', 'path': 'bar'}

When it shouldn't given the conjure client generator separates it out with underscores already and is fairly straightforward to at least look for it.

"file_system_id": ConjureFieldDefinition("fileSystemId", str),