python-poetry / tomlkit

Style-preserving TOML library for Python
MIT License
699 stars 100 forks source link

`tomlkit` 0.12.5 : Encoder contract interferes with external `TypeError`s raised in encoders #355

Closed sanmai-NL closed 5 months ago

sanmai-NL commented 5 months ago

Problem

register_encoder expects encoder functions to raise a TypeError when encoding fails. TypeErrors are an error class not specific to tomlkit, however, and this causes interference and unclear tracebacks.

For example:

> /Users/user/bla/__main__.py(50)encoder()
-> if any(
(Pdb) n
TypeError: any() takes exactly one argument (6 given)
(Pdb) c
...
  File "/Users/user/bla/.venv/lib/python3.12/site-packages/tomlkit/items.py", line 218, in item
    raise _ConvertError(f"Invalid type {type(value)}")
tomlkit.items._ConvertError: Invalid type <class 'Legitimateclass'>

Solution

Change the API to use a tomlkit-specific and fine-grained, encoder-specific exception class.

frostming commented 5 months ago

Can you show me how the TypeError occurs? It seems like the encoder is not well implemented.

Library and frameworks should be loose about input and strict about output.

sanmai-NL commented 5 months ago

@frostming I think the OP's backtrace is sufficient to answer your question. Not well implemented is besides the point here: non-standard libraries should define their own exception type(s) if they assume they're the only one to raise them. The example is to show that.

frostming commented 5 months ago

json's default= argument also expects a TypeError to skip the encoder, the error can also be raised on wrong argument, we just follow the same convention.

In [1]: def my_encoder(o):
   ...:     if any(1,2,3,4):
   ...:         return str(o)
   ...:     return None
   ...: 

In [2]: import json

In [3]: json.dumps({"a": object()}, default=my_encoder)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[12], line 1
----> 1 json.dumps({"a": object()}, default=my_encoder)

File /opt/homebrew/Cellar/python@3.11/3.11.8/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/__init__.py:238, in dumps(obj, skipkeys, ensure_ascii, check_circular, allow_nan, cls, indent, separators, default, sort_keys, **kw)
    232 if cls is None:
    233     cls = JSONEncoder
    234 return cls(
    235     skipkeys=skipkeys, ensure_ascii=ensure_ascii,
    236     check_circular=check_circular, allow_nan=allow_nan, indent=indent,
    237     separators=separators, default=default, sort_keys=sort_keys,
--> 238     **kw).encode(obj)

File /opt/homebrew/Cellar/python@3.11/3.11.8/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/encoder.py:200, in JSONEncoder.encode(self, o)
    196         return encode_basestring(o)
    197 # This doesn't pass the iterator directly to ''.join() because the
    198 # exceptions aren't as detailed.  The list call should be roughly
    199 # equivalent to the PySequence_Fast that ''.join() would do.
--> 200 chunks = self.iterencode(o, _one_shot=True)
    201 if not isinstance(chunks, (list, tuple)):
    202     chunks = list(chunks)

File /opt/homebrew/Cellar/python@3.11/3.11.8/Frameworks/Python.framework/Versions/3.11/lib/python3.11/json/encoder.py:258, in JSONEncoder.iterencode(self, o, _one_shot)
    253 else:
    254     _iterencode = _make_iterencode(
    255         markers, self.default, _encoder, self.indent, floatstr,
    256         self.key_separator, self.item_separator, self.sort_keys,
    257         self.skipkeys, _one_shot)
--> 258 return _iterencode(o, 0)

Cell In[1], line 2, in my_encoder(o)
      1 def my_encoder(o):
----> 2     if any(1,2,3,4):
      3         return str(o)
      4     return None

TypeError: any() takes exactly one argument (4 given)
sanmai-NL commented 5 months ago

It could be a suboptimal design, more of Python‘s standard library suffers from that. (But json is part of that, and tomlkit isn't, so TypeError is it‘s own.)

You could enforce using a subclass of TypeError for the custom encoder to improve the design.

To illustrate your example, can you show how the custom encoder gets invoked? I don't see that in the example which makes it harder to comment on the appropriateness of the exception message.

frostming commented 5 months ago

Okay, I am convinced.

To illustrate your example, can you show how the custom encoder gets invoked?

The example was wrong, I've updated it