swiftlang / swift

The Swift Programming Language
https://swift.org
Apache License 2.0

[SR-454] Reimplement float -> string with better algorithms #43071

Closed 05262b81-54a9-4fe1-bf6a-96f8042de10e closed 6 years ago

05262b81-54a9-4fe1-bf6a-96f8042de10e commented 8 years ago
Previous ID SR-454
Radar None
Original Reporter @lilyball
Type Improvement
Status Resolved
Resolution Done
Additional Detail from JIRA

| | |
|------------------|-----------------|
|Votes | 0 |
|Component/s | Standard Library |
|Labels | Improvement |
|Assignee | @tbkka |
|Priority | Medium |

md5: 6d5a4cc175f95505ead8bf96760e0182

duplicates:

relates to:

Issue Description:

Right now, Swift uses the equivalent of printf with a format of %.*g and a precision of std::numeric_limits<T>::digits10 to convert floating-point values into strings. While relatively simple, this has the downside of not being exact: if you take an arbitrary floating-point value, convert it to a string, and parse that string back into the same floating-point type, you're not guaranteed to get the same value.
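To make the round-trip problem concrete, here is a minimal sketch in Python rather than Swift. It assumes Python's `%.*g` formatting behaves like the printf call described above, and uses `sys.float_info.dig` (15 for a 64-bit double) as the stand-in for `digits10`. The helper name `format_digits10` is made up for illustration.

```python
import sys

def format_digits10(x: float) -> str:
    # Stand-in for printf("%.*g", digits10, x).
    # sys.float_info.dig is 15 for an IEEE 754 64-bit double.
    return "%.*g" % (sys.float_info.dig, x)

x = 0.1 + 0.2                  # the double nearest to 0.30000000000000004
s = format_digits10(x)         # 15 significant digits, trailing zeros dropped
round_trips = float(s) == x    # parse the string back and compare

print(s)            # 0.3
print(round_trips)  # False
```

Fifteen digits are not enough to distinguish this value from the double nearest to 0.3, so the parsed result differs from the input.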

Instead, we should reimplement this using an algorithm that does give an exact result. The ideal result is the "simplest" string that produces the input float when parsed again, but the only hard requirement should be an exact result.
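One way to picture the "simplest" string is a brute-force search over precisions, sketched below in Python. This is only an illustration of the goal, not how Grisu, Dragon4, or dtoa work, and the function name `shortest_round_trip` is hypothetical. It assumes a 64-bit double, for which 17 significant digits are always enough to round-trip. Because it only tries the correctly rounded output at each precision, it may miss a shorter valid string in rare edge cases.

```python
def shortest_round_trip(x: float) -> str:
    # Try increasing precision until the formatted string parses back to x.
    # 17 significant digits always suffice for a 64-bit double.
    s = ""
    for precision in range(1, 18):
        s = "%.*g" % (precision, x)
        if float(s) == x:
            return s
    # Only reached for NaN, which never compares equal to itself.
    return s

print(shortest_round_trip(0.1))        # 0.1
print(shortest_round_trip(0.1 + 0.2))  # 0.30000000000000004
```

A real implementation would compute the digits directly instead of formatting and re-parsing up to 17 times, which is where the algorithms discussed below come in.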

Python uses David Gay's dtoa library for performing exact conversions. The downsides to his algorithm are that it appears to be incredibly complex (so we'd likely want to use the C library directly instead of trying to reimplement the algorithm) and that it uses allocation. I'm not sure what the actual performance is, and I don't know if we necessarily want to require allocation for basic functionality like this (i.e. if we ever want a subset of the stdlib that's allocation-free for use in embedded systems, we'd need an algorithm that doesn't allocate). FWIW, I believe this library is what's used in many libc implementations.

Around May of last year, Rust switched its implementation to a combination of Grisu3 and Dragon4. David Gay's algorithm was evaluated but discarded because Rust needed an algorithm without allocation, and also because Rust wanted the implementation in Rust itself and Gay's algorithm was so complicated that nobody wanted to try to reimplement it. According to the comments, both Grisu3 and Dragon4 are exact algorithms; Grisu3 is fast but incomplete, whereas Dragon4 is slow but complete. I believe Rust tries Grisu3 first and falls back to Dragon4 for values that Grisu3 can't format.

Some links for context:

David Gay's algorithm (PDF)
David Gay's dtoa library
Florian Loitsch's Grisu3 (PDF)
Stack Overflow comment saying Grisu3 is much faster than dtoa
Steele & White's Dragon4 (PDF)
rust-lang/rust#24612 PR implementing Grisu3+Dragon4
rust-lang/rust#24556 issue with relevant discussion
rust-lang/rust#24557 issue with more discussion

05262b81-54a9-4fe1-bf6a-96f8042de10e commented 8 years ago

#43064 is a related issue about string -> float conversion

tbkka commented 6 years ago

This is related to #42728 and https://github.com/apple/swift-corelibs-foundation/issues/4412

tbkka commented 6 years ago

I'm marking this as a duplicate of #42728. I've been working on a fix for this and will discuss it further there.

05262b81-54a9-4fe1-bf6a-96f8042de10e commented 6 years ago

I disagree, #42728 is simply complaining that printing Double doesn't have enough precision, which is a separate issue from the question of what algorithm to use.

tbkka commented 6 years ago

More modern algorithms always pick the right precision, thus addressing both bugs.

I have a "better algorithm" implemented and tested and am preparing it for a PR quite soon. (It's a variant of Grisu2 that includes insights from Errol4.)

05262b81-54a9-4fe1-bf6a-96f8042de10e commented 6 years ago

If #42728 is to be fixed by changing the algorithm, shouldn't #42728 be duped onto this?

tbkka commented 6 years ago

See https://github.com/apple/swift/pull/15474

tbkka commented 6 years ago

This has been done in PR #15474.