Serialization of nested data classes

halirutan commented 2 years ago

I'm using SimpleParsing version 0.0.19post0 installed through pip. Consider the situation that you have several data classes with unrelated options. Each of these classes contains only primitive type variables and is itself easily serializable. Now, I want to have a wrapper data class that contains instances of the other classes and I want to be able to serialize (works) and deserialize it (doesn't work). I'm sure this is supposed to work and I'm only doing something wrong, but here is an example:

from __future__ import annotations
from dataclasses import dataclass
from simple_parsing.helpers import Serializable

@dataclass
class Opts1(Serializable):
    a: int = 64
    b: float = 1.0

@dataclass
class Opts2(Serializable):
    a: int = 32
    b: float = 0.0

@dataclass
class Wrapper(Serializable):
    opts1: Opts1 = Opts1()
    opts2: Opts2 = Opts2()

# Show that it's not possible to deserialize nested dataclasses
opts = Wrapper()
reconstructed = Wrapper.loads_json(opts.dumps_json())
print(reconstructed)

You get a warning from helpers/serialization/decoding.py:176 that it is "Unable to find a decoding function for type IncludedOpts1. Will try to use the type as a constructor." The deserialized Wrapper object does not contain instances of Opts1 and Opts2 but rather two dicts which shows that the deserialization failed.

I read through the example that used a Tensor and provided a custom encoding and decoding function, but I don't see why this should be necessary here. Both classes Opts1 and Opts2 are already perfectly serializable on their own.

On a related note: While serializing and deserializing e.g. Opts1 on their own works for JSON, it throws a similar warning for YAML but it reconstructs the object correctly

opts1 = Opts1()
reconstructed1 = Opts1.loads_yaml(opts1.dumps_yaml())
print(reconstructed1)

Expected Behavior: Such deserialization should work out of the box. If not, I'd appreciate any tips on how to do this correctly.

Desktop:

OS: Ubuntu
Version 0.0.19post0
Python version: 3.8

schmidtijoe commented 2 years ago

same Issue. simple-parsing installed through anaconda3.

OS: Windows 10
0.0.19post0
python version 3.8

Wrote simple save and load scripts for the wrapper using the subclass deserializers (e.g. .to_dict() method). However, i guess it really should work out of the box. Additionally, the example/serialization scripts, where specific encode and decode functions are shown, do not make a difference for the presented issue.

lebrice commented 2 years ago

Hey there! Thanks for posting, this is indeed a bug, and I'm very surprised the test suite hasn't spotted it before!

I'll check this out and get back to you.

Thanks again!

lebrice commented 2 years ago

This seems to be due to the from __future__ import annotations, I'll keep digging.

lebrice commented 2 years ago

Ok I got it figured out, and there's a fix for this waiting on CI in #127.

The problem is, the type annotations are read as strings when you use from __future__ import annotations. So the get_decoding_fn couldn't find the annotation within the dictionary of all registered decoding functions for each type (since it's a string, not a type!).

The warning it raises is also misleading, since it showed the name of the class as a string, without quotation marks.

lebrice commented 2 years ago

Ok the PR closed this automatically, but I'll make a quick simple-parsing 0.0.19.post1 release.

Thanks for pointing this out!

halirutan commented 2 years ago

Uh... thanks a lot for the fast response. I was really unsure if I overlooked something in the docs. Great that it works now.

lebrice / SimpleParsing

Serialization of nested data classes #126