microsoft / STL

MSVC's implementation of the C++ Standard Library.

<charconv>: Investigate the Eisel-Lemire ParseNumberF64 algorithm for from_chars() #1610

Open StephanTLavavej opened 3 years ago

StephanTLavavej commented 3 years ago

The current implementation of from_chars() that I originally shipped in VS 2017 15.8 (released Aug 2018) is a refined version of the UCRT's strtod() with no algorithmic improvements. (At the time I concluded, apparently correctly, that no substantially better algorithms were known, and the UCRT was the only practical option regarding licensing.)
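For readers unfamiliar with the interface under discussion, here is a minimal sketch of calling the floating-point overload of `std::from_chars()`. The helper name `parse_double` is hypothetical and not part of the STL; it only illustrates the standard call and how the returned `ptr` and `ec` are typically checked.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not part of the STL): parse an entire string as a double.
// Returns std::nullopt on a parse error, on out-of-range input, or when
// characters remain after the parsed number.
std::optional<double> parse_double(std::string_view text) {
    double value = 0.0;
    const char* const first = text.data();
    const char* const last = first + text.size();

    // std::chars_format::general accepts both fixed and scientific notation.
    const auto [ptr, ec] = std::from_chars(first, last, value, std::chars_format::general);

    if (ec != std::errc{} || ptr != last) {
        return std::nullopt;
    }
    return value;
}
```

Unlike `strtod()`, `std::from_chars()` is locale-independent, does not skip leading whitespace, and does not accept a leading `+` sign, so a wrapper like this rejects inputs such as `" 1.5"` and `"+1.5"`.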

Now that Future Technology :artificial_satellite: is available, we can look into using it. (After finishing C++20!)

lemire commented 2 years ago

It works for both 32-bit and 64-bit floats. It has been adopted by Rust as of 1.55. It is part of the LLVM libc standard library. It has also been adopted by Microsoft in C# as of .NET 7. It is part of the standard C++ library under Linux as of GCC 12. It is part of Go as of 1.16.

pdimov commented 1 year ago

Some updates on this:

The current (work in progress) implementation of Boost.Charconv, which uses the old Lemire algorithm (in the paper, not the improved one in the fast_float repo), has the following performance:

Microsoft Visual C++ version 14.3
Dinkumware standard library version 650

                std::strtox<float>, scientific:  4482 ms (s=-2.83944e+20)
            std::from_chars<float>, scientific:  3348 ms (s=-2.83944e+20)
boost::charconv::from_chars<float>, scientific:  2172 ms (s=-2.83944e+20)

                std::strtox<double>, scientific:  8599 ms (s=1.47618e+299)
            std::from_chars<double>, scientific:  7262 ms (s=1.47618e+299)
boost::charconv::from_chars<double>, scientific:  2872 ms (s=1.47618e+299)

                std::strtox<float>, general:  4272 ms (s=-2.83944e+20)
            std::from_chars<float>, general:  3294 ms (s=-2.83944e+20)
boost::charconv::from_chars<float>, general:  2136 ms (s=-2.83944e+20)

                std::strtox<double>, general:  8584 ms (s=1.47618e+299)
            std::from_chars<double>, general:  7227 ms (s=1.47618e+299)
boost::charconv::from_chars<double>, general:  2863 ms (s=1.47618e+299)

                std::strtox<float>, uint64:  3527 ms (s=1.01167e+19)
            std::from_chars<float>, uint64:  1951 ms (s=1.01167e+19)
boost::charconv::from_chars<float>, uint64:  2393 ms (s=1.01167e+19)

                std::strtox<double>, uint64:  3538 ms (s=1.01167e+19)
            std::from_chars<double>, uint64:  2000 ms (s=1.01167e+19)
boost::charconv::from_chars<double>, uint64:  2367 ms (s=1.01167e+19)

Benchmark source here, loosely based on a Boost mailing list posting by @StephanTLavavej.
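As a rough illustration of how numbers like these can be collected, here is a minimal timing sketch. It is not the linked benchmark: the names `BenchResult`, `time_parser`, `parse_with_strtod`, and `parse_with_from_chars` are hypothetical, and the input generation is left to the caller. The accumulated sum plays the same role as the `s=` value in the output above, keeping the optimizer from discarding the parsing work.

```cpp
#include <charconv>
#include <chrono>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical result type: the accumulated sum and the elapsed wall time.
struct BenchResult {
    double sum;
    long long milliseconds;
};

// Parse every input `repetitions` times with `parse` and time the whole loop.
template <class ParseFn>
BenchResult time_parser(const std::vector<std::string>& inputs, int repetitions, ParseFn parse) {
    const auto start = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (int r = 0; r < repetitions; ++r) {
        for (const std::string& s : inputs) {
            sum += parse(s);
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(stop - start);
    return BenchResult{sum, static_cast<long long>(elapsed.count())};
}

// Baseline: the C library parser (locale-dependent, needs a null-terminated string).
inline double parse_with_strtod(const std::string& s) {
    return std::strtod(s.c_str(), nullptr);
}

// Candidate: std::from_chars; errors are deliberately ignored in this sketch,
// so the inputs are assumed to be valid numbers.
inline double parse_with_from_chars(const std::string& s) {
    double value = 0.0;
    (void) std::from_chars(s.data(), s.data() + s.size(), value);
    return value;
}
```

A call such as `time_parser(inputs, 100, parse_with_from_chars)` would then be compared against the same call with `parse_with_strtod`. Matching sums give a quick sanity check that both parsers produced the same values.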

From these numbers it looks like using Lemire's algorithm would be a significant improvement, doubly so because Boost.Charconv uses a strtox fallback at present, which is slower than MSSTL std::from_chars.

In further news, Prof. Lemire has graciously agreed to license the new fast_float code under BSL: https://github.com/fastfloat/fast_float/pull/203

lemire commented 1 year ago

@pdimov Indeed. We are going to relicense the library under the Boost license. It just takes a little time to reach out to contributors, but it won't be long.

lemire commented 1 year ago

Version 5.0.0 of the fast_float library is now available under the Boost license (among others). There was unanimous agreement.

https://github.com/fastfloat/fast_float/releases/tag/v5.0.0

lemire commented 1 year ago

Note that the fast_float library is part of WebKit and GCC (as of GCC 12). It is very well tested and its performance has received a lot of tuning. We encourage its wide adoption.