lebrice / SimpleParsing

Simple, Elegant, Typed Argument Parsing with argparse
MIT License

Error message when attempting to dumps_json a dataclass with ndarrays is unclear #245

Open zhiruiluo opened 1 year ago

zhiruiluo commented 1 year ago

Describe the bug: The error message raised by the following code snippet does not make clear what the actual problem is.

To Reproduce

from __future__ import annotations
from simple_parsing import Serializable
from dataclasses import dataclass
import numpy as np

@dataclass
class Config(Serializable):
    a: int = np.array(1.1)

def test_dumps_json():
    config = Config()
    print(config.dumps_json())

Expected behavior: Either no error should occur, or a TypeError should be raised.

Actual behavior: A RecursionError is raised instead:

    def test_dumps_json():
        config = Config()
>       print(config.dumps_json())

test/test_issue_recurs.py:12: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
simple_parsing/helpers/serialization/serializable.py:158: in dumps_json
    return dumps_json(self, dump_fn=dump_fn, **kwargs)
simple_parsing/helpers/serialization/serializable.py:658: in dumps_json
    return dumps(dc, dump_fn=partial(dump_fn, **kwargs))
simple_parsing/helpers/serialization/serializable.py:653: in dumps
    return dump_fn(dc)
~/.conda/envs/p39c116/lib/python3.9/json/__init__.py:234: in dumps
    return cls(
~/.conda/envs/p39c116/lib/python3.9/json/encoder.py:199: in encode
    chunks = self.iterencode(o, _one_shot=True)
~/.conda/envs/p39c116/lib/python3.9/json/encoder.py:257: in iterencode
    return _iterencode(o, 0)
simple_parsing/helpers/serialization/encoding.py:27: in default
    return encode(o)
~/.conda/envs/p39c116/lib/python3.9/functools.py:888: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
~/.conda/envs/p39c116/lib/python3.9/functools.py:831: in dispatch
    impl = dispatch_cache[cls]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <WeakKeyDictionary at 0x7fbc6d7f6190>, key = <class 'numpy.ndarray'>

    def __getitem__(self, key):
>       return self.data[ref(key)]
E       RecursionError: maximum recursion depth exceeded in comparison

~/.conda/envs/p39c116/lib/python3.9/weakref.py:416: RecursionError

The recursion error is misleading.

lebrice commented 1 year ago

NumPy arrays are not JSON serializable, which is probably what causes this error.

$ python
Python 3.9.15 (main, Nov 24 2022, 14:31:59) 
[GCC 11.2.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> import json
>>> json.dumps({"a": np.array(1.0)})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/mila/n/normandf/.conda/envs/datamodules/lib/python3.9/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/home/mila/n/normandf/.conda/envs/datamodules/lib/python3.9/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/home/mila/n/normandf/.conda/envs/datamodules/lib/python3.9/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/home/mila/n/normandf/.conda/envs/datamodules/lib/python3.9/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type ndarray is not JSON serializable
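
For reference, the standard `json` module accepts a `default` callable that it invokes for objects it cannot serialize. Below is a minimal sketch, assuming it is acceptable to store arrays as plain lists. `to_jsonable` is a hypothetical helper name, not part of SimpleParsing, and the demo uses the standard-library `array.array` (which also exposes `tolist()`) as a stand-in so that it runs without NumPy:

```python
import json
from array import array


def to_jsonable(obj):
    """Fallback for json.dumps: convert array-like objects via tolist().

    Hypothetical helper, not part of SimpleParsing. Any object exposing a
    ``tolist()`` method (numpy.ndarray, array.array, ...) is converted to
    plain Python values; anything else gets an explicit TypeError.
    """
    tolist = getattr(obj, "tolist", None)
    if callable(tolist):
        return tolist()
    raise TypeError(
        f"Object of type {type(obj).__name__} is not JSON serializable"
    )


# array.array stands in for numpy.ndarray here.
encoded = json.dumps({"a": array("d", [1.5, 2.5])}, default=to_jsonable)
print(encoded)
```

The same callable could be passed when dumping a dict that contains NumPy arrays, since `numpy.ndarray` also provides `tolist()`.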

zhiruiluo commented 1 year ago

Hi @lebrice, I figured out that one possible fix is to register an encode_default fallback on the encode dispatch function used by SimpleJsonEncoder.

@encode.register
def encode_default(obj: object) -> str:
    return json.dumps(obj)

With this fallback registered, the traceback ends in a TypeError instead of a RecursionError:

simple_parsing/helpers/serialization/serializable.py:158: in dumps_json
    return dumps_json(self, dump_fn=dump_fn, **kwargs)
simple_parsing/helpers/serialization/serializable.py:658: in dumps_json
    return dumps(dc, dump_fn=partial(dump_fn, **kwargs))
simple_parsing/helpers/serialization/serializable.py:653: in dumps
    return dump_fn(dc)
~/.conda/envs/simple_parsing/lib/python3.9/json/__init__.py:234: in dumps
    return cls(
~/.conda/envs/simple_parsing/lib/python3.9/json/encoder.py:199: in encode
    chunks = self.iterencode(o, _one_shot=True)
~/.conda/envs/simple_parsing/lib/python3.9/json/encoder.py:257: in iterencode
    return _iterencode(o, 0)
simple_parsing/helpers/serialization/encoding.py:27: in default
    return encode(o)
~/.conda/envs/simple_parsing/lib/python3.9/functools.py:888: in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
simple_parsing/helpers/serialization/encoding.py:146: in encode_default
    return json.dumps(obj)
~/.conda/envs/simple_parsing/lib/python3.9/json/__init__.py:231: in dumps
    return _default_encoder.encode(obj)
~/.conda/envs/simple_parsing/lib/python3.9/json/encoder.py:199: in encode
    chunks = self.iterencode(o, _one_shot=True)
~/.conda/envs/simple_parsing/lib/python3.9/json/encoder.py:257: in iterencode
    return _iterencode(o, 0)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <json.encoder.JSONEncoder object at 0x7fa79b15b070>, o = array(1.1)

    def default(self, o):
        """Implement this method in a subclass such that it returns
        a serializable object for ``o``, or calls the base implementation
        (to raise a ``TypeError``).

        For example, to support arbitrary iterators, you could
        implement default like this::

            def default(self, o):
                try:
                    iterable = iter(o)
                except TypeError:
                    pass
                else:
                    return list(iterable)
                # Let the base class default method raise the TypeError
                return JSONEncoder.default(self, o)

        """
>       raise TypeError(f'Object of type {o.__class__.__name__} '
                        f'is not JSON serializable')
E       TypeError: Object of type ndarray is not JSON serializable

However, this conflicts with https://github.com/lebrice/SimpleParsing/blob/68e16b2e72f6b1b29fd770f2b78ffa8ed44cc892/simple_parsing/helpers/serialization/serializable.py#L731-L737

lebrice commented 1 year ago

Hmm, interesting idea, but encode is actually supposed to be agnostic to the serialization format. I'll keep thinking about this.
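
One format-agnostic alternative would be for the fallback to raise a descriptive `TypeError` itself rather than call `json.dumps`. The sketch below uses a stand-in `encode` built with `functools.singledispatch`. It is not SimpleParsing's actual `encode`, and the registered handlers are illustrative assumptions:

```python
from functools import singledispatch
from typing import Any


@singledispatch
def encode(obj: Any) -> Any:
    """Stand-in fallback: reject unknown types with a clear message."""
    raise TypeError(
        f"No encoder registered for type {type(obj).__name__!r}; "
        "register one with @encode.register."
    )


# Primitive values pass through unchanged, whatever the output format.
@encode.register(int)
@encode.register(float)
@encode.register(str)
@encode.register(bool)
@encode.register(type(None))
def _encode_primitive(obj):
    return obj


# Containers are encoded recursively, so a nested unknown type still
# reaches the fallback above and produces the descriptive TypeError.
@encode.register(list)
@encode.register(tuple)
def _encode_sequence(obj):
    return [encode(item) for item in obj]


@encode.register(dict)
def _encode_mapping(obj):
    return {encode(key): encode(value) for key, value in obj.items()}
```

With a fallback like this, an unregistered type fails immediately with a `TypeError` that names the type, and no serialization format is assumed.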