Open toughengineer opened 4 years ago
As far as I know libc++ charconv implementation is derived from the STLs one.
The reason that code is not using the builtins is most likely that their support in MSVC was only merged recently (See #795)
Thanks for the performance idea! MSVC's and libc++'s integer <charconv>
implementations were independently written, which is why we're lacking this technique.
I'm glad to help.
After I submitted this issue I realized that writing to a buffer on the stack and then memcpy
'ing to destination buffer may still be faster than calculating the needed length in advance.
The calculation uses log2 as counting bits instruction (which is several clocks on modern CPUs) and some multiplication+shifting+table lookup and fixup.
I tried to ask people who compared different implementations (e.g. to_chars
/to_string
and that of fmtlib, maybe @vitaut can comment on this) and got no definitive answer. So, I guess, one should investigate if it's really beneficial to spend time calculating the size in advance, and possibly take other considerations into account.
I ran some benchmarks in http://www.zverovich.net/2020/06/13/fast-int-to-string-revisited.html. Size computation (https://github.com/fmtlib/fmt/blob/60c43e870387affb07de32ed4fc2794ac7a9f60d/include/fmt/format.h#L787) is a relatively insignificant factor compared to other optimizations (see the difference between format_to
and format_int
).
Also, I have considered having different codepaths for "definitely enough space for the worst case" and "maybe not enough space, need to do precise bounds checking".
(Edit: @BillyONeal reminded me that we generate digits backwards, so even when there is plenty of space, we don't immediately know where to begin writing.)
(Edit x2: @BillyONeal also reminded me that Ryu's integer printing generates 2 digits at once, but the integer printing outside of Ryu generates only 1 digit at a time.)
For integers instead of filling the buffer from right to left and
memcpy
'ing the result back so the most significant digit (that is written last) ends up in the leftmost position, you can calculate the necessary buffer size ahead of time and put all the digits exactly at their final destination place.See example in libc++: https://github.com/llvm/llvm-project/blob/f54402b63a4f5b0b4b15e0f82ce8ff8501b206e6/libcxx/include/charconv#L164-L168 straight from the Bit Twiddling Hacks: https://graphics.stanford.edu/~seander/bithacks.html#IntegerLog10
memcpy
usage here: https://github.com/microsoft/STL/blob/550713eba27c5803addb6f615d3526400cb2df37/stl/inc/charconv#L137-L146