Open pirate opened 1 week ago
After looking more into the other options unfortunately they are worse in other ways, for example they can barely handle a fraction of the types that uiri/toml
supports :/
I guess we are stuck writing a custom TomlEncoder
class to work around the bugs for now.
import re
import toml
def better_toml_dump_str(val):
try:
return toml.encoder._dump_str(val)
except Exception:
# if we hit any of toml's numerous encoding bugs,
# fall back to using json representation of string
return json.dumps(str(val))
class CustomTOMLEncoder(toml.encoder.TomlEncoder):
def __init__(self, **kwargs):
super().__init__(**kwargs)
# override the dumper funcs for any types you need to support here:
self.dump_funcs[str] = better_toml_dump_str
self.dump_funcs[re.RegexFlag] = better_toml_dump_str
...
# this works:
>>> test_dict = {
"string_with_escape_sequences": "\033[00;00m",
"example_weird_type": re.IGNORECASE | re.UNICODE | re.MULTILINE,
# etc. try more weird types here
}
>>> print(benedict(test_dict).to_toml(encoder=CustomTOMLEncoder()))
string_with_escape_sequences = "\u001b[00;00m"
example_weird_type = "re.IGNORECASE|re.UNICODE|re.MULTILINE"
You're welcome to close this issue 🤷, up to you.
@pirate thank you for reporting this problem.
Actually, for decoding toml
on Python >= 3.11
, the tomlib
standard lib is used:
https://github.com/fabiocaccamo/python-benedict/blob/main/benedict/serializers/toml.py#L10
I think it would be cool to improve TOML support, would you like to submit a PR with a list of failing test cases (that should succeed with the perfect library)? So it would be much easier to test other TOML libraries or doing fixes in a different way.
Thanks so much for building python-benedict, it's really awesome!
So unfortunatley the underlying TOML library https://github.com/uiri/toml used by benedict has a bunch of long-time outstanding bugs that break parsing/loading and generally make it unsafe to load->dump->load the same string.
For example, you cannot dump any dict containing escape sequences without it throwing an exception:
But thats not it, there are many other fairly major string escaping, quoting, and parse/dump cycle inconsistency bugs that have bitten other projects using
uiri/toml
:Almost all of these are 2yr+ old, indicating they're probably not going to all get fixed anytime soon without significant increase in velocity. No harm no foul, it's open source we can't demand they go faster and they don't owe us anything, but maybe benedict could consider a different library with fewer major outstanding parser consistency issues?
Is benedict open to switching to a library without these issues? Maybe one of these:
I'll chip in $20 towards the work to make the switch if it's an option ⬇️
Upvote & Fund