Open alvinhochun opened 2 years ago
Tried making a quick benchmark with nanobench:
clang++ locale-perf-test.cpp -o locale-perf-test-mingw.exe -O2 -std=c++17
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
89.72 | 11,145,303.12 | 1.8% | 2.40 | [ref] osstream with filler string |
28,449.35 | 35,150.19 | 0.6% | 2.42 | osstream << int (not C locale) |
1,045.37 | 956,599.96 | 0.8% | 2.41 | osstream << int (is C locale) |
8.12 | 123,093,432.05 | 1.6% | 2.38 | std::to_string(int) |
3.18 | 314,807,231.25 | 1.6% | 2.41 | std::to_chars(int) |
28,740.35 | 34,794.28 | 0.7% | 2.42 | osstream << double (not C locale) |
1,344.94 | 743,525.48 | 0.5% | 2.32 | osstream << double (is C locale) |
320.15 | 3,123,519.84 | 0.7% | 2.42 | std::to_string(double) |
46.89 | 21,324,673.41 | 0.5% | 2.43 | std::to_chars(double) |
g++ locale-perf-test.cpp -o locale-perf-test-mingwgcc.exe -O2
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
216.24 | 4,624,530.54 | 0.8% | 2.39 | [ref] osstream with filler string |
237.27 | 4,214,571.27 | 0.9% | 2.40 | osstream << int (not C locale) |
238.98 | 4,184,446.21 | 1.1% | 2.45 | osstream << int (is C locale) |
9.60 | 104,189,551.72 | 2.3% | 2.43 | std::to_string(int) |
7.76 | 128,815,417.84 | 0.4% | 2.41 | std::to_chars(int) |
1,016.96 | 983,320.82 | 1.0% | 2.42 | osstream << double (not C locale) |
1,026.09 | 974,573.86 | 0.7% | 2.42 | osstream << double (is C locale) |
526.30 | 1,900,065.50 | 1.0% | 2.45 | std::to_string(double) |
55.40 | 18,049,144.14 | 0.5% | 2.42 | std::to_chars(double) |
cl locale-perf-test.cpp /Fe:locale-perf-test-msvc.exe /O2 /EHsc /std:c++17
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
409.53 | 2,441,849.56 | 0.2% | 2.42 | [ref] osstream with filler string |
722.66 | 1,383,768.22 | 0.3% | 2.46 | osstream << int (not C locale) |
722.47 | 1,384,135.16 | 0.2% | 2.42 | osstream << int (is C locale) |
9.91 | 100,885,320.90 | 0.6% | 2.46 | std::to_string(int) |
9.35 | 106,907,753.86 | 0.3% | 2.42 | std::to_chars(int) |
1,065.95 | 938,132.66 | 0.8% | 2.42 | osstream << double (not C locale) |
1,065.88 | 938,188.30 | 0.8% | 2.41 | osstream << double (is C locale) |
611.46 | 1,635,419.73 | 0.4% | 2.42 | std::to_string(double) |
51.11 | 19,564,228.70 | 0.5% | 2.42 | std::to_chars(double) |
(Edited to add double-to-string for comparison.)
Just for future reference, @alvinhochun did post an initial patch to try to fix this, in https://reviews.llvm.org/D128600, by calling _vsnprintf_s_l
in the mingw codepath, like this:
diff --git a/libcxx/src/support/win32/locale_win32.cpp b/libcxx/src/support/win32/locale_win32.cpp
index 57ef94932ba0..05b8ce2e1db3 100644
--- a/libcxx/src/support/win32/locale_win32.cpp
+++ b/libcxx/src/support/win32/locale_win32.cpp
@@ -90,11 +90,15 @@ int snprintf_l(char* ret, size_t n, locale_t loc, const char* format, ...) {
int result = __stdio_common_vsprintf(
_CRT_INTERNAL_LOCAL_PRINTF_OPTIONS | _CRT_INTERNAL_PRINTF_STANDARD_SNPRINTF_BEHAVIOR, ret, n, format, loc, ap);
#else
- __libcpp_locale_guard __current(loc);
- _LIBCPP_DIAGNOSTIC_PUSH
- _LIBCPP_CLANG_DIAGNOSTIC_IGNORED("-Wformat-nonliteral")
- int result = vsnprintf(ret, n, format, ap);
- _LIBCPP_DIAGNOSTIC_POP
+ int result;
+ if (n != 0) {
+ va_list ap_copy;
+ va_copy(ap_copy, ap);
+ result = _vsnprintf_s_l(ret, n, _TRUNCATE, format, loc, ap_copy);
+ va_end(ap_copy);
+ }
+ if (n == 0 || result < 0)
+ result = _vscprintf_l(format, loc, ap);
#endif
va_end(ap);
return result;
This does build for both UCRT and msvcrt (_vsnprintf_s_l
and _vscprintf_l
do exist in msvcrt.dll, since Windows Vista), but formatting of long double
fails. On x86 mingw, long double
is the x87 80 bit floats, while MSVC environments (and msvcrt.dll and UCRT) have long double
as 64 bit. Libcxx in the CI configuration is built with the define __USE_MINGW_ANSI_STDIO
enabled, which redirects the common stdio functions to mingw reimplementations, which use the right expected form of long double
.
To fix this, the mingw stdio function reimplementations, used with __USE_MINGW_ANSI_STDIO
, would need to be extended with an implementation of _vsnprintf_s_l
.
Alternatively, the codepath using _vsnprintf_s_l
could be used conditionally, as long as sizeof(double) == sizeof(long double)
(i.e. for mingw, on non-x86), or if __USE_MINGW_ANSI_STDIO
isn't enabled. (llvm-mingw builds of libc++ aren't built with that enabled currently - so they don't format long doubles correctly in these cases anyway.)
(Alternatively, the whole formatting could be reimplemented in libc++, but that's probably a much larger undertaking.)
This is the same issue as https://github.com/llvm/llvm-project/issues/40476 but specifically for mingw. The prior issue was fixed for MSVC with https://reviews.llvm.org/D59525 and the follow-up https://reviews.llvm.org/D59727, but mingw still uses the slow path requiring
__libcpp_locale_guard
.I don't have proper benchmark data now, but converting numbers to strings using
std::ostringstream
is apparently quite slow on mingw. Especially when the current locale is not"C"
, thesetlocale
call takes the majority of time (see image below). When the locale is already"C"
it seems about a magnitude faster butsetlocale
is still quite significant in profiling data.(This is the code in use.)
The toolchain we use is llvm-mingw 20220323 (LLVM 14.0.0).