uiri / toml

Python lib for TOML
MIT License

dump breaks numpy.str_ into lists of characters #418

Closed: rocheseb closed this issue 1 year ago

rocheseb commented 1 year ago
import toml
import numpy
d = {"str":"hello","numpy.str_":numpy.str_("hello"),"numpy.str":numpy.str("hello")}
print(toml.dumps(d))

Outputs:

str = "hello"
"numpy.str_" = [ "h", "e", "l", "l", "o",]
"numpy.str" = "hello"

Is it intended that numpy.str_ strings are broken up into lists of characters? I would have expected the same output as for str values.

abrahammurciano commented 1 year ago

Not sure if it's the intended behaviour, but I stumbled across the same problem with StrEnum. It happens because the current implementation doesn't support subtypes of the supported basic types. For example:

from enum import StrEnum
import toml

class ValidStrings(StrEnum):
    FOO = "foo"

assert isinstance(ValidStrings.FOO, str) # OK
data = {"foo": ValidStrings.FOO}
print(toml.dumps(data))

Output:

foo = [ "f", "o", "o",]

I did some investigating, and the cause seems to be that TomlEncoder.dump_value looks up the dump function by the exact type of the value:

dump_fn = self.dump_funcs.get(type(v))
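To illustrate why an exact-type lookup misses subclasses, here is a minimal standalone sketch. It does not use the library: `dump_funcs` is a hypothetical stand-in table and `MyStr` is a made-up subclass playing the role of numpy.str_ or a StrEnum member. The last line shows that iterating a string yields its characters, which would be consistent with the output above if unmatched iterables are handled as lists (an assumption, not something confirmed here).

```python
# Standalone illustration; `dump_funcs` is a hypothetical stand-in,
# not the library's actual table.
class MyStr(str):
    """A trivial str subclass, standing in for numpy.str_ or a StrEnum member."""


dump_funcs = {str: lambda v: '"' + v + '"'}

plain = "hello"
sub = MyStr("hello")

# Exact-type lookup: type(sub) is MyStr, which is not a key in the table.
print(dump_funcs.get(type(plain)) is not None)  # True
print(dump_funcs.get(type(sub)) is not None)    # False

# The value is still a str as far as isinstance is concerned.
print(isinstance(sub, str))                     # True

# Iterating a string yields its characters one by one.
print(list(sub))                                # ['h', 'e', 'l', 'l', 'o']
```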

Perhaps isinstance should be used instead to play nice with inheritance? I'd suggest this:

dump_fn = next((f for t, f in self.dump_funcs.items() if isinstance(v, t)), None)
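The isinstance-based lookup can be tried out in isolation with a rough sketch like the one below. The helper name `find_dump_fn`, the table contents, and `MyStr` are all hypothetical and only mimic the shape of the idea; this is not the library's code.

```python
def find_dump_fn(dump_funcs, v):
    """Return the first registered function whose type matches v via isinstance, else None."""
    # The generator expression needs its own parentheses because next()
    # is also given a default as a second argument.
    return next((f for t, f in dump_funcs.items() if isinstance(v, t)), None)


class MyStr(str):
    """Hypothetical str subclass used only for this demonstration."""


# Hypothetical table. Order matters: bool is a subclass of int, so bool
# has to come first or True would be matched by the int entry.
dump_funcs = {
    bool: lambda v: str(v).lower(),
    int: lambda v: str(v),
    str: lambda v: '"' + v + '"',
}

value = MyStr("foo")
print(find_dump_fn(dump_funcs, value)(value))  # "foo"
print(find_dump_fn(dump_funcs, True)(True))    # true
print(find_dump_fn(dump_funcs, 3.5))           # None
```

One trade-off with this approach is that the result depends on the iteration order of the table, so more specific types would have to be registered before their base types.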

I'd be happy to open a pull request for this.
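In the meantime, one possible workaround is to convert str subclasses to plain str before dumping. The sketch below is only an illustration: `to_plain` is a hypothetical helper, not part of the library, and it assumes that calling str() on the value returns the underlying text (true for a plain subclass like `MyStr` here, but a subclass that overrides `__str__` would need different handling). Note that it also turns tuples into lists.

```python
def to_plain(value):
    """Recursively replace str subclasses with built-in str; leave other values alone."""
    if isinstance(value, str):
        return str(value)  # yields an exact built-in str for str subclasses
    if isinstance(value, dict):
        return {to_plain(k): to_plain(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_plain(v) for v in value]
    return value


class MyStr(str):
    """Hypothetical str subclass used only for this demonstration."""


data = {"name": MyStr("hello"), "tags": [MyStr("a"), "b"], "n": 1}
clean = to_plain(data)

print(type(data["name"]) is str)   # False
print(type(clean["name"]) is str)  # True
```

The cleaned dictionary could then be passed to toml.dumps in place of the original one.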