lebrice / SimpleParsing

Simple, Elegant, Typed Argument Parsing with argparse
MIT License
384 stars 46 forks source link

Default handling logic for simpleparsing.parse doesn't respect dataclasses with custom decoder #266

Open dlwh opened 1 year ago

dlwh commented 1 year ago

Describe the bug

The default parsing logic seems to want to call asdict on a dataclass field even if it has a registered decoding function:

To Reproduce

def test_parse_helper_uses_custom_decoding_fn():

    @dataclasses.dataclass
    class Id:
        value: str

    register_decoding_fn(Id, (lambda x, drop_extra_fields: Id(value=x)))

    @dataclasses.dataclass
    class Person:
        name: str
        id: Id

    config_str = """
    name: bob
    id: hi
    """

    @dataclasses.dataclass
    class StrIdPerson:
        name: str
        id: str

    # ok
    assert get_decoding_fn(Person)({"name": "bob", "id": "hi"}) == Person("bob", Id("hi"))  # type: ignore

    with tempfile.NamedTemporaryFile("w", suffix=".yaml") as f:
        f.write(config_str)
        f.flush()

        # ok
        assert simple_parsing.parse(StrIdPerson, f.name, args=[]) == StrIdPerson("bob", "hi")

        # boom
        assert simple_parsing.parse(Person, f.name, args=[]) == Person("bob", Id("hi"))

Expected behavior Test should pass.

Actual behavior

Last assertion fails:


simple_parsing/parsing.py:1021: in parse
    parsed_args = parser.parse_args(args)
argparse.py:1826: in parse_args
    args, argv = self.parse_known_args(args, namespace)
simple_parsing/parsing.py:298: in parse_known_args
    self.set_defaults(config_file)
simple_parsing/parsing.py:405: in set_defaults
    wrapper.set_default(default_for_dataclass)
simple_parsing/wrappers/dataclass_wrapper.py:308: in set_default
    nested_dataclass_wrapper.set_default(field_default_value)
simple_parsing/wrappers/dataclass_wrapper.py:292: in set_default
    field_default_values = dataclasses.asdict(value)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

obj = 'hi'

    def asdict(obj, *, dict_factory=dict):
        """Return the fields of a dataclass instance as a new dictionary mapping
        field names to field values.

        Example usage:

          @dataclass
          class C:
              x: int
              y: int

          c = C(1, 2)
          assert asdict(c) == {'x': 1, 'y': 2}

        If given, 'dict_factory' will be used instead of built-in dict.
        The function applies recursively to field values that are
        dataclass instances. This will also look into built-in containers:
        tuples, lists, and dicts.
        """
        if not _is_dataclass_instance(obj):
>           raise TypeError("asdict() should be called on dataclass instances")
E           TypeError: asdict() should be called on dataclass instances

Desktop (please complete the following information):

Additional context

lebrice commented 1 year ago

Hey @dlwh , encoding = dc->dict (via the encode function) and decoding = dict->dc

Therefore asdict is more about encoding, and therefore I dont understand why calling asdict on the dataclass would be wrong if it has a custom decoding function registered.

lebrice commented 1 year ago

I dont completely understand the test in the issue atm. I'll take a better look a bit later today.

lebrice commented 1 year ago

When parsing with a config file, the custom decoding function isnt being used?

lebrice commented 1 year ago

Would you mind helping me out @dlwh ? I'm still having trouble understanding what the issue is here.

dlwh commented 1 year ago

Yes, dataclasses that have custom decode functions (like Id in the example) don't use that custom decode function when parsing from a "json dict", or when using argparse. I had a mostly complete fix in #267.

(FWIW, I ended up forking Pyrallis so I'm going it alone, so happy to just close this.)