Closed bencameron closed 3 years ago
cc: @tannergooding
This change is by design and was done for IEEE 754 compliance. This blog post goes into more detail about the range of changes that occurred: https://devblogs.microsoft.com/dotnet/floating-point-parsing-and-formatting-improvements-in-net-core-3-0/
This change is by design and was done for IEEE 754 compliance. This blog post goes into more detail
The blog does claim it is making parsers and formatters IEEE 754-2008 compliant. However I cannot find anything about strings (or parsing, or formatting) in the IEEE 754-2008 document.
Displaying "-0" makes .ToString() on floating point numbers suitable for low-level debugging only. I think it's a bad decision.
However I cannot find anything about strings (or parsing, or formatting) in the IEEE 754-2008 document.
It is covered in 5.12 Details of conversion between floating-point data and external character sequences:
5.12.1 External character sequences representing zeros, infinities, and NaNs The conversions (described in 5.4.2) from supported formats to external character sequences and back that recover the original floating-point representation, shall recover zeros, infinities, and quiet NaNs, as well as non-zero finite numbers. In particular, signs of zeros and infinities are preserved.
Displaying "-0" makes .ToString() on floating point numbers suitable for low-level debugging only. I think it's a bad decision.
It is what the majority of languages/frameworks that are IEEE 754 compliant (which is most of them) do, including C/C++, Rust, JavaScript, Python, etc.
What format specifier (or workaround) should @charlesroddie use if he wants "default but with no -0" ?
There isn't a way to avoid printing -0 with standard format strings. For those, you need to explicitly convert -0 to +0 ((x != 0 ? x : 0) should work).
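A minimal sketch of that explicit conversion follows. The helper name NormalizeZero is hypothetical (it is not a framework API), and the outputs in the comments assume .NET Core 3.0 or later with the invariant culture.

```csharp
using System;
using System.Globalization;

public static class ZeroNormalization
{
    // Hypothetical helper: maps -0.0 to +0.0 and leaves every other value untouched.
    // Under IEEE 754 comparison -0.0 != 0 is false, so negative zero takes the
    // second branch and becomes positive zero. NaN != 0 is true, so NaN passes through.
    public static double NormalizeZero(double x) => x != 0 ? x : 0.0;

    public static void Main()
    {
        double negativeZero = -0.0;
        CultureInfo inv = CultureInfo.InvariantCulture;

        Console.WriteLine(negativeZero.ToString(inv));                // "-0" on .NET Core 3.0+
        Console.WriteLine(NormalizeZero(negativeZero).ToString(inv)); // "0"
        Console.WriteLine(NormalizeZero(-1.5).ToString(inv));         // "-1.5"
    }
}
```

This only handles values that are exactly zero. A value such as -0.01 that rounds to zero during formatting is not affected by the helper.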
For custom numeric format strings, "0;-0" should currently work, as zero hits the first section, not the second (we may have an issue on this, as custom numeric format strings were largely left untouched and we were to revisit them in the future).
But what about -0.01 formatted with "0"? It will return "-0", and you can't work around that with the ternary operator. Do you think there might be some legacy mode or something for all software migrating from older versions of the Framework to 3+?
But what about -0.01 formatted with "0"? It will return "-0", and you can't work around that with the ternary operator.
The three-section format separator works: https://docs.microsoft.com/en-us/dotnet/standard/base-types/custom-numeric-format-strings#the--section-separator. That is, Console.WriteLine($"{-0.01:0;-0;0}"); will print 0.
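As a small sketch of the three-section approach (positive;negative;zero), the class and constant names below are illustrative only. Outputs assume .NET Core 3.0 or later and the invariant culture.

```csharp
using System;
using System.Globalization;

public static class ThreeSectionFormat
{
    // Sections are: positive ; negative ; zero.
    // A value that rounds to zero under the first or second section
    // is formatted with the third section instead.
    public const string NoNegativeZero = "0;-0;0";

    public static string Format(double value) =>
        value.ToString(NoNegativeZero, CultureInfo.InvariantCulture);

    public static void Main()
    {
        Console.WriteLine(Format(-0.01)); // "0"
        Console.WriteLine(Format(-0.0));  // "0"
        Console.WriteLine(Format(12.3));  // "12"
        Console.WriteLine(Format(-12.7)); // "-13"
    }
}
```

The same pattern extends to other precisions, for example "0.00;-0.00;0.00", since each section is an ordinary custom format string.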
Do you think there might be some legacy mode or something for all software migrating from older versions of the Framework to 3+?
There is not currently. There were also a lot more fixes that went in aside from the formatting fix for negative zero (including fixes around formatting/parsing correctness). Maintaining the incorrect legacy code has a non-trivial cost and I don't think it is worth the trade-off.
Having programs update to use custom numeric format specifiers to control the printing behavior of 0 and -0 would be the most forward and backward compatible option.
@tannergooding Thanks for the link to the relevant part of IEEE 754-2008. The wording provides a suitable way to limit this: "conversions from supported formats to external character sequences and back that recover the original floating-point representation".
So for compliance there should exist a string serializer to a decimal string (which could be .ToString() which can be reversed to give a float with exactly the same bits as the original) and which needs to write "-0" in some form.
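A rough sketch of what "reversed to give a float with exactly the same bits" means in practice is shown below. It assumes .NET Core 3.0 or later, where negative zero is formatted as "-0" and parsed back as negative zero. The class and method names are made up for illustration.

```csharp
using System;
using System.Globalization;

public static class RoundTrip
{
    // Formats with the round-trip specifier "R" and parses the text back.
    public static double ThroughString(double value)
    {
        string text = value.ToString("R", CultureInfo.InvariantCulture);
        return double.Parse(text, CultureInfo.InvariantCulture);
    }

    // Compares raw bit patterns, so +0.0 and -0.0 count as different values.
    public static bool SameBits(double a, double b) =>
        BitConverter.DoubleToInt64Bits(a) == BitConverter.DoubleToInt64Bits(b);

    public static void Main()
    {
        foreach (double value in new[] { -0.0, 0.0, 0.1, -1.0 / 3.0, double.MaxValue })
        {
            // Expected to print True for every value on .NET Core 3.0+.
            Console.WriteLine(SameBits(value, ThroughString(value)));
        }
    }
}
```

Comparing with == would hide the difference this thread is about, because -0.0 == 0.0 is true. The bit comparison is what makes the sign of zero observable.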
There is no need for this to apply to rounded or formatted strings which are not designed to represent the floating point number precisely at a low level. The purpose of formatting is not for low-level debugging but to present to users running released code.
There is no need for this to apply to rounded or formatted strings which are not designed to represent the floating point number precisely at a low level
Other parts of the spec detail rounding requirements around specialized format strings. For example, 5.12.2 External decimal character sequences representing finite numbers:
Conversions to and from supported decimal formats shall be correctly rounded regardless of how many digits are requested or given.
The purpose of formatting is not for low-level debugging but to present to users running released code.
That is one purpose of formatting. It is equally valid for it to be used for serialization or other purposes. We are now performing IEEE 754 compliant formatting by default and provide custom numeric format strings that allow you to otherwise specialize the behavior (including removing the sign from 0). This is no different than the formatting for any of the other integral types which are correctly rounded and displayed by default but can be customized using custom formatters when you have exact display requirements.
IEEE 754 binary floating-point values are usable in a range of scenarios, including in higher mathematics where -0 is a concept and can impact the computation of your overall algorithm (and which has to be taken into account when dealing with infinities, complex numbers, and other scenarios).
Other parts of the spec detail rounding requirements around specialized format strings. For example, 5.12.2 External decimal character sequences representing finite numbers:
OK, 5.12.2 does contain conditions on preserving sign, which on a strict reading of the document would include the negative sign for zero.
including in higher mathematics where -0 is a concept
This is not correct. 0 is the additive identity in mathematics and "-0=0" follows from "-0=0+-0=0". Signed zero does not exist in mathematics, only in numerical computing (and even there mostly when discussing the IEEE 754 standard). Any article search will confirm this.
We are now performing IEEE 754 compliant formatting by default and provide custom numeric format strings that allow you to otherwise specialize the behavior (including removing the sign from 0).
These look too hard and unfinished. If someone has to look up a thread to work out how to avoid "-0" then this has already failed. The API should bear in mind that 99.999% of devs would not want "-0" and if displayed to end users it would always be a bug. So getting "0" should be as easy as possible and getting "-0" should not be easier and should have some sort of warning/xml doc/other hurdle.
Adding a String.FormatIEEE754 method for the special purpose of displaying negative zero as "-0" would be one option.
IEEE 754 5.12.2 generates problems across many languages, where people are always asking how to get rid of "-0". "Always add zero before converting to strings" seems to be a popular recommendation! It's important to minimize this problem.
This is not correct. 0 is the additive identity in mathematics and "-0=0" follows from "-0=0+-0=0". Signed zero does not exist in mathematics, only in numerical computing (and even there mostly when discussing the IEEE 754 standard). Any article search will confirm this.
It exists as a concept in various higher level mathematics and propagates sometimes necessary information about the direction from which a result was returned. The wikipedia article on Signed zero mostly talks about the computational usage around IEEE 754, but also briefly delves into some mathematical and scientific applications of it.
IEEE 754 5.12.2 generates problems across many languages, where people are always asking how to get rid of "-0". "Always add zero before converting to strings" seems to be a popular recommendation! It's important to minimize this problem.
It is still the standard that most modern languages (and hardware) conform to and is what they expose as the default scenario. This applies to C, C++, Rust, Python, JavaScript, Java, x86, ARM, C#, now .NET, and more. It is, in my opinion, something that people should be aware of when dealing with float/double (in any language), same as they should be aware that 0.3 is not exactly representable and that (a + b) + c may be different from a + (b + c) (even though in "regular" math, it is a completely safe thing to do). It just helps avoid potential bugs that creep in later.
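For illustration, the two pitfalls mentioned here can be reproduced with a few lines. This is a sketch with made-up class and method names, using 0.1, 0.2 and 0.3 as sample values for double arithmetic.

```csharp
using System;

public static class FloatingPointPitfalls
{
    // 0.1 + 0.2 is not the same double as the literal 0.3, because
    // none of these decimal fractions is exactly representable in binary.
    public static bool SumEqualsThreeTenths()
    {
        double a = 0.1, b = 0.2;
        return a + b == 0.3;
    }

    // Each addition rounds its result, so grouping can change the final value.
    public static bool GroupingGivesSameResult()
    {
        double a = 0.1, b = 0.2, c = 0.3;
        return (a + b) + c == a + (b + c);
    }

    public static void Main()
    {
        Console.WriteLine(SumEqualsThreeTenths());    // False
        Console.WriteLine(GroupingGivesSameResult()); // False
    }
}
```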
The API should bear in mind that 99.999% of devs would not want "-0" and if displayed to end users
Developers wanting to display or take information from end users also need to take into account globalization or that certain numbers return "non pretty" results. This is really no different. If you want to customize the display, we provide custom numeric format strings for doing so and that will allow you to explicitly handle zero how you would like.
I do understand where you are coming from. It is an observable behavior change between 2.2/prior and 3.0+, it represents a change that various users may need to handle when porting applications, and it may also be a behavior that not everybody likes as the new default.
However, the change we made was deliberate and decided upon after getting numerous other issues filed on the behavior being non-compliant and non-compatible with other languages/frameworks (including how the C# compiler parses values on their end). We deemed the break to be overall worthwhile and that it put the framework in a better state moving forward. The change is documented alongside the other breaking changes here: https://docs.microsoft.com/en-us/dotnet/core/compatibility/2.2-3.0#floating-point-formatting-and-parsing-behavior-changed.
It exists as a concept in various higher level mathematics... The wikipedia article on Signed zero mostly talks about the computational usage around IEEE 754, but also briefly delves into some mathematical and scientific applications of it.
It is really not true that it exists as a concept in any pure mathematics. No mathematical usage is mentioned in the wikipedia article. There is only a negative temperature usage mentioned, where it seems to be casual notation for either an informal negative infinitesimal or tending to zero from below. The linked reference is not publicly accessible and a search does not bring up other usages on this topic.
It is, in my opinion, something that people should be aware of when dealing with float/double (in any language), same as they should be aware that 0.3 is not exactly representable
I agree that people should be aware that 0.3 is not exactly representable, i.e. of fundamental limitations of floating point representations. However negative zero is a quirk which, unless it's deliberately exposed to you as it is here, is only of interest if you maintain numerical libraries. I happen to know that it exists because I maintain numerical libraries, and the quirks of -0 and atan2 affect correctness. But it's really not of relevance to most programmers, who use floating point numbers but not in a way that needs to distinguish between -0.0 and +0.0.
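For readers unfamiliar with the quirks referred to here, a short sketch of where the sign of zero becomes observable (division and Math.Atan2) might look like this. The class name is illustrative.

```csharp
using System;

public static class SignedZeroEffects
{
    public static void Main()
    {
        double positiveZero = 0.0;
        double negativeZero = -0.0;

        // The two zeros compare equal...
        Console.WriteLine(positiveZero == negativeZero);                  // True

        // ...but dividing by them gives infinities of opposite sign.
        Console.WriteLine(double.IsPositiveInfinity(1.0 / positiveZero)); // True
        Console.WriteLine(double.IsNegativeInfinity(1.0 / negativeZero)); // True

        // On the negative x-axis, the sign of a zero y picks the side of the
        // branch cut: +0.0 yields +pi and -0.0 yields -pi.
        Console.WriteLine(Math.Atan2(positiveZero, -1.0) > 0);            // True
        Console.WriteLine(Math.Atan2(negativeZero, -1.0) < 0);            // True
    }
}
```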
The change is documented alongside the other breaking changes here: https://docs.microsoft.com/en-us/dotnet/core/compatibility/2.2-3.0#floating-point-formatting-and-parsing-behavior-changed.
This links to https://devblogs.microsoft.com/dotnet/floating-point-parsing-and-formatting-improvements-in-net-core-3-0/ for more details, and the changes mentioned in these links - which seem perfectly good - don't address -0 at all.
I don't really know what form the warning should take, but any docs about formatting floats need to put the -0 behavior and how to remove it up front, because the default will almost always be incorrect in released code.
Closing as this is by design, as per the comments above.