Closed by mpenkov 7 years ago
Not really intended, more like a side effect of what the thing does, i.e. not putting quotes around every single thing and such. Note that numbers in YAML can be represented in all sorts of ways, e.g. 1_000 or 5e10 or 0xff should all be escaped to remain strings too.
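To make the point above concrete, here is a minimal sketch that feeds a few unquoted scalars to PyYAML's `yaml.safe_load` and prints the Python type each one comes back as. The sample list and the `resolve_plain` helper are only illustrative, not part of pretty-yaml. Which spellings count as numbers depends on the parser and the YAML version it implements, so treat the printed output as the thing to look at rather than assuming it.

```python
import yaml  # PyYAML

# Illustrative strings that would all be written unquoted by a "pretty" dumper.
samples = ["123", "1_000", "0xff", "5e10", "1.123", "null", "false", "hello"]


def resolve_plain(text):
    """Parse `text` as a one-scalar YAML document and return the Python value."""
    return yaml.safe_load(text)


for text in samples:
    value = resolve_plain(text)
    # Anything that is not reported as `str` here would change type on a round trip.
    print(f"{text!r:>10} -> {type(value).__name__}: {value!r}")
```

Any sample that does not come back as `str` is one that needs quoting if the original string is supposed to survive a dump/load cycle.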
So I'd say if you're looking for something that reliably serializes-deserializes stuff, use PyYAML directly, maybe with a bunch of the options from the top of the readme - these shouldn't break anything.
I think this is a good example of "bad things that can happen" to put into a readme warning, thanks for bringing it up, but I don't think it should be fixed.
Hopefully this should make such crazy behavior more obvious or at least harder to miss: https://github.com/mk-fg/pretty-yaml/#warning
A warning is better than nothing, but wouldn't it be easier (from the point of view of the user) for us to predict this craziness and code around it? I understand there'll always be edge cases that require convoluted solutions that are worse than the problem, but this particular case seems relatively simple: if it's a string, and it parses into a float, then keep it as a string (quote it).
You're right from the "output must be correct" perspective, but the way I tend to use this module, I'd prefer it to actually be incorrect but not have quotes around every digit-only string in there, and maybe use str() after the parser if such output has to be parsed and it's a common-enough case.
Might be easy to add an option to do what you hinted at though, i.e. some init keyword plus a check in represent_stringish.
OK, I'll have a look at it.
I understand the idea behind "you probably don't want quotes around integers, so I won't put them" but in that case, can this be made configurable with a flag?
Certainly can be, as mentioned, I'm just too lazy to do it - I quite rarely use the module myself as it is, and certainly not for correct serialization, as also mentioned.
Edit: as a side-note, I think such an option can be reasonably implemented via one extra parsing pass of PyYAML over the string values serialized through this module: if a value parses back to the original one - keep it, otherwise replace it with a safe-style variant, or pick a style by running extra dump/parse ops.
This is because unquoted strings in YAML are notoriously unsafe, and it's not just 123 that'll be an issue, but 1_000, 1.123, 1e123, null, false, 2021-10-10T4:30:00, etc etc - the parser knows about all these, so using it should be a good way to catch all such cases.
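A rough sketch of the extra-parsing-pass idea described above, assuming plain PyYAML and nothing from pretty-yaml itself: `needs_quoting` and `render_scalar` are hypothetical helper names, and the double-quoted fallback is just one possible "safe style". The check asks the parser whether the unquoted text loads back as the same string, and quotes it only if it does not.

```python
import yaml  # PyYAML


def needs_quoting(value: str) -> bool:
    """Return True if `value`, written unquoted, would not load back as itself."""
    try:
        return yaml.safe_load(value) != value
    except (yaml.YAMLError, ValueError):
        # Not parseable as a plain scalar at all (or rejected by a constructor),
        # so the unquoted form cannot be trusted.
        return True


def render_scalar(value: str) -> str:
    """Emit `value` unquoted when that is safe, else in double-quoted style."""
    if not needs_quoting(value):
        return value
    # Assumption: double quotes as the fallback style; other styles would work too.
    return yaml.safe_dump(value, default_style='"').strip()


for text in ["hello", "123", "1_000", "null", "false", "2021-10-10T4:30:00"]:
    print(f"{text!r:>22} -> {render_scalar(text)}")
```

The check is deliberately conservative: strings with leading whitespace, a trailing comment, or mapping/list syntax also fail the comparison and get quoted. The cost is one extra parse per string value, which is probably acceptable for the small, human-oriented files this module is aimed at.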
For example:
Is this intended behavior? The problem is that the data we read back from this sort of YAML is different to what was written.