Open jk-jeon opened 1 year ago
fmt::format is modeled after Python's str.format, where shortest refers to the precision, not the full output. std::format diverged a bit because it was specified in terms of to_chars.
I honestly feel like the shortest string is what people may expect, but that's of course just a subjective opinion. If you are going to change the behavior (or accept a PR that does so) in the future, it would be great. If not, please feel free to close this, but I think this difference needs to be documented anyway in places like https://fmt.dev/dev/api.html#compatibility-with-c-20-std-format.
I am open to PRs to address this backed by more analysis of the effects of the change and concrete examples.
Note that this also results in the rather surprising (to me) behavior that e.g. 123456792.0f formats as "123456790", the last digit apparently being wrong. But these roundtrip to the same float and 123456790 is shorter in the sense of having fewer sigfigs. std::to_chars formats it as 123456792.
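To make the comparison above concrete, here is a minimal sketch using only the standard library. The helper names `to_chars_shortest` and `parses_to` are made up for illustration, and the sketch assumes a standard library with floating-point `std::to_chars` support (for example a recent GCC or MSVC). It checks that both digit strings mentioned in the thread parse back to the same float, and shows what plain `std::to_chars` emits.

```cpp
#include <cassert>
#include <charconv>
#include <cstdlib>
#include <string>
#include <system_error>

// Hypothetical helper: text produced by plain std::to_chars (no format, no precision).
std::string to_chars_shortest(float value) {
    char buf[64];
    auto res = std::to_chars(buf, buf + sizeof buf, value);
    assert(res.ec == std::errc{});  // 64 bytes is ample for any float in this mode
    return std::string(buf, res.ptr);
}

// Hypothetical helper: true if parsing `text` as a float yields exactly `expected`.
bool parses_to(const char* text, float expected) {
    return std::strtof(text, nullptr) == expected;
}
```

With these helpers, `parses_to("123456790", 123456792.0f)` and `parses_to("123456792", 123456792.0f)` should both hold, while `to_chars_shortest(123456792.0f)` yields the exact value rather than the string with the trailing zero.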
This is unrelated and I am surprised that to_chars produces "garbage" digits in this case.
Why is that "garbage" in this case? That value is perfectly representable as a float. Here's a nicely formatted sweep of some values for example: https://godbolt.org/z/a3Y8r1v6K
Is there a way to control how many digits are kept before rounding in this particular case, without switching to exponential notation, or should this be filed as a separate issue altogether?
That's the term they used in the Grisu paper. You can control precision, so there is no issue here.
So this seems to be because std::to_chars is specified in terms of the number of characters, not the number of decimal digits. 123456784 and 123456780 are both of the shortest length, but the former is closer to the true value, so the implementation faithfully following the std spec must print the former.
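The tie-break described in this comment can be sketched by brute force: enumerate the nearby decimal integers, keep those that parse back to the same float, and pick the one closest to the exact value. This is only an illustration of the rule, not how any real `std::to_chars` implementation works. `RoundTripSet` and `scan_round_trips` are hypothetical names, and the sketch assumes the input is an integer-valued float that fits in `long long`.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical result type: how many candidates round-trip, and which is closest.
struct RoundTripSet {
    int count = 0;
    long long closest = 0;
};

// Assumes `value` is integer-valued and within long long range.
RoundTripSet scan_round_trips(float value, long long radius = 16) {
    const long long exact = static_cast<long long>(value);
    RoundTripSet out;
    long long best_dist = -1;  // -1 means "no candidate seen yet"
    for (long long c = exact - radius; c <= exact + radius; ++c) {
        const std::string text = std::to_string(c);
        // Skip candidates that parse to a different float.
        if (std::strtof(text.c_str(), nullptr) != value) continue;
        ++out.count;
        const long long dist = std::llabs(c - exact);
        if (best_dist < 0 || dist < best_dist) {
            best_dist = dist;
            out.closest = c;
        }
    }
    return out;
}
```

For `123456784.0f`, several nine-character integers survive the round-trip filter, and `closest` comes out as the exact value, which matches the output the comment says a conforming implementation must print.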
So... this is interesting... we may need to look at what std::to_chars implementers have done if we ever want this behavior to be implemented in fmt.
EDIT: Here is the relevant code from microsoft/STL:
https://github.com/microsoft/STL/blob/192a84008a59ac4d2e55681e1ffac73535788674/stl/inc/xcharconv_ryu.h#L1368
https://github.com/microsoft/STL/blob/192a84008a59ac4d2e55681e1ffac73535788674/stl/inc/xcharconv_ryu.h#L1406
As far as I understand, the default formatting option should produce the shortest output, not just in the number of significand digits, but also in the number of actual characters. At least that seems to be how std::format is specified, according to the std::to_chars specifications.
However, it seems fmt currently picks the fixed-point format whenever the exponent is between -4 and 16, regardless of the number of characters it will produce: https://github.com/fmtlib/fmt/blob/3baaa8d899ced2f9ded80a3f142efd41808730e3/include/fmt/format.h#L2644
Is this an intended divergence? Or maybe I misunderstood how std::format is specified?
For what it's worth, it seems the MS STL implementation of std::format does what I described.
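The "shortest in characters" rule from the post can be illustrated with the standard library alone. This is a rough sketch under the assumption that floating-point `std::to_chars` is available. The helpers `plain`, `with_format`, and `shorter_of` are hypothetical names, and nothing here calls fmt itself. For a value such as 1e15, whose exponent falls inside the range where the post says fmt chooses fixed notation, the two notations differ greatly in length.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: plain std::to_chars, which chooses the notation itself.
std::string plain(double v) {
    char buf[512];
    auto res = std::to_chars(buf, buf + sizeof buf, v);
    assert(res.ec == std::errc{});
    return std::string(buf, res.ptr);
}

// Hypothetical helper: shortest round-trip digits in an explicitly requested notation.
std::string with_format(double v, std::chars_format fmt) {
    char buf[512];  // large enough for the fixed form of any finite double
    auto res = std::to_chars(buf, buf + sizeof buf, v, fmt);
    assert(res.ec == std::errc{});
    return std::string(buf, res.ptr);
}

// Hypothetical helper: pick whichever notation has fewer characters, fixed on a tie.
std::string shorter_of(double v) {
    const std::string fix = with_format(v, std::chars_format::fixed);
    const std::string sci = with_format(v, std::chars_format::scientific);
    return sci.size() < fix.size() ? sci : fix;
}
```

For 1e15 the fixed form is sixteen characters long and the scientific form is five, so `shorter_of(1e15)` and `plain(1e15)` should agree on the scientific form, whereas a rule based only on the exponent range would keep the fixed form.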