HowardHinnant / date

A date and time library based on the C++11/14/17 <chrono> header

Invalid Unicode character in month name from to_stream #726

Open wjones127 opened 2 years ago

wjones127 commented 2 years ago

Hello! I've been working on adding timezone support on Windows to Apache Arrow (https://github.com/apache/arrow/pull/12536), but I've gotten stuck on an odd failure in to_stream with the French UTF-8 locale. The month name it produces contains a byte sequence that is not valid UTF-8. I'm using MSVC 19.30.30709.0, if that helps.

I have stripped down the test to this:

using namespace std::chrono;
using namespace arrow_vendored::date;

auto d = local_days{August / 18 / 2021};
auto locale_str = "fr_FR.UTF-8";
std::locale loc(locale_str);

std::ostringstream oss;
oss.imbue(loc);
to_stream(oss, "%d %B %Y", d);
oss.clear();  // resets the stream's error flags only; the buffered text is kept
const auto s = oss.str();

EXPECT_EQ(s, "18 août 2021"); // Fails

Test output:

25:  error: Expected equality of these values:
25:   s
25:     Which is: "18 ao\xFBt 2021"
25:   "18 août 2021"
25:     Which is: "18 ao\xC3\xBBt 2021"
25:     As Text: "18 août 2021"
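
For reference, the two byte sequences in the output differ in a recognizable way: 0xFB is the single-byte code for "û" in Latin-1 (and in Windows-1252, which agrees with Latin-1 at that position), while 0xC3 0xBB is the UTF-8 encoding of the same character, U+00FB. The sketch below only illustrates that relationship. It assumes the month name is being emitted in a single-byte encoding, which the output is consistent with but does not prove. `latin1_to_utf8` and `check_month_bytes` are hypothetical helpers written for this illustration; they are not part of date or Arrow.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of date or Arrow): re-encode a Latin-1
// (ISO-8859-1) byte string as UTF-8. Every Latin-1 byte maps to the code
// point of the same value, so bytes >= 0x80 become two-byte sequences.
std::string latin1_to_utf8(const std::string& in) {
  std::string out;
  out.reserve(in.size());
  for (unsigned char c : in) {
    if (c < 0x80) {
      // ASCII is identical in both encodings.
      out.push_back(static_cast<char>(c));
    } else {
      // U+0080..U+00FF: lead byte 110000xx, continuation byte 10xxxxxx.
      out.push_back(static_cast<char>(0xC0 | (c >> 6)));
      out.push_back(static_cast<char>(0x80 | (c & 0x3F)));
    }
  }
  return out;
}

// Compares the byte strings reported in the test output above.
void check_month_bytes() {
  const std::string actual = "18 ao\xFBt 2021";        // what to_stream produced
  const std::string expected = "18 ao\xC3\xBBt 2021";  // UTF-8 "18 août 2021"
  assert(actual != expected);
  // Under the Latin-1 assumption, re-encoding the actual bytes gives the
  // expected UTF-8 string.
  assert(latin1_to_utf8(actual) == expected);
}
```

Calling `check_month_bytes()` passes both assertions, which matches the reading that the text is correct but in a different encoding than the one the locale name asks for. It says nothing about where in the toolchain the single-byte encoding is chosen.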