zeux / pugixml

Light-weight, simple and fast XML parser for C++ with XPath support
http://pugixml.org/
MIT License

PUGIXML_WCHAR_MODE and error messages. #591

Closed: adontz closed this issue 1 year ago

adontz commented 1 year ago

As far as I understand, PUGIXML_WCHAR_MODE exists to make life under Windows easier, since all modern Windows API functions require UCS-2/UTF-16.

So my question/suggestion is: should error messages be UTF-16 too when PUGIXML_WCHAR_MODE is defined?

Changing "const char" to "const char_t" and wrapping the constants in PUGIXML_TEXT seems to be enough, but I'm not familiar with the codebase and may be missing something.
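
A minimal sketch of the pattern being suggested here, written with stand-in names so it compiles without pugixml. `DEMO_WCHAR_MODE`, `demo_char_t`, `DEMO_TEXT` and `describe_ok` are hypothetical and only mirror the roles of `PUGIXML_WCHAR_MODE`, `char_t` and `PUGIXML_TEXT`; this is not the library's actual source.

```cpp
#include <string>

// Stand-ins for pugixml's char_t / PUGIXML_TEXT (hypothetical names).
#ifdef DEMO_WCHAR_MODE
typedef wchar_t demo_char_t;
#define DEMO_TEXT(t) L##t
#else
typedef char demo_char_t;
#define DEMO_TEXT(t) t
#endif

typedef std::basic_string<demo_char_t> demo_string;

// With the suggested change, the return type follows the character mode
// instead of always being const char*.
const demo_char_t* describe_ok()
{
    return DEMO_TEXT("No error");
}
```

Building with `-DDEMO_WCHAR_MODE` switches both the literal and the return type to `wchar_t`. Every caller that stores the result in a `const char*` would then stop compiling.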

zeux commented 1 year ago

Sure - one perspective here is that the error description is a string and strings are UTF-16/32 in wchar mode so it should be UTF-16/32.

Another perspective is that the strings that change behavior in wchar mode are strings that represent XML content and need to be processed as Unicode, whereas the error description is ASCII. An example of another type of string that doesn't change behavior under wchar mode is the file path in load_file/save_file arguments. In a Windows application, I'd expect that if the user needs to be presented with the parsing failure reason, xml_parse_result::status should likely be used instead to produce a localized Unicode error message specific to the application. Also, any Windows application that has to work with C++ exceptions should already be used to processing ASCII exception reasons.
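
A rough sketch of the status-based approach, assuming the application owns its messages. The `demo_parse_status` enum is a hypothetical stand-in for a few `pugi::xml_parse_status` values so the block compiles without pugixml; `user_message` and the message texts are also made up for illustration.

```cpp
#include <string>

// Stand-in for a few pugi::xml_parse_status values (hypothetical subset),
// so the sketch compiles without pugixml.
enum demo_parse_status
{
    demo_status_ok,
    demo_status_file_not_found,
    demo_status_out_of_memory,
    demo_status_end_element_mismatch
};

// Maps a status code to an application-owned wide message.
// A real application would load these from its localization resources.
std::wstring user_message(demo_parse_status status)
{
    switch (status)
    {
    case demo_status_ok:
        return L"";
    case demo_status_file_not_found:
        return L"The file could not be found.";
    case demo_status_out_of_memory:
        return L"Not enough memory to load the document.";
    default:
        // Any other failure is reported as a generic syntax problem.
        return L"The document is not well-formed XML.";
    }
}
```

In real code the switch would presumably be on the `status` member of the `pugi::xml_parse_result` returned by a load call, using constants such as `pugi::status_ok` and `pugi::status_file_not_found`.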

But really, it doesn't matter: changing this would break existing programs. For something this minor, which has good reasons not to use UTF-16/32 in the first place, this will definitely not change; it's been an ASCII string since wchar mode was introduced. It should be trivial to get a UTF-16/32 string using pugi::as_wide if necessary.
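
To illustrate why the conversion is trivial: an ASCII string widens one character at a time, because ASCII code points have the same value in UTF-16 and UTF-32. The sketch below is stand-alone, and `widen_ascii` is a hypothetical helper, not part of pugixml.

```cpp
#include <string>

// Hypothetical helper: widens a 7-bit ASCII C string one character at a time.
// ASCII code points have the same numeric value in UTF-16 and UTF-32.
std::wstring widen_ascii(const char* text)
{
    std::wstring result;
    for (const char* p = text; *p != '\0'; ++p)
        result.push_back(static_cast<wchar_t>(static_cast<unsigned char>(*p)));
    return result;
}
```

With pugixml itself no helper is needed: `pugi::as_wide(result.description())` returns a wide string directly. It treats its input as UTF-8, of which ASCII is a subset.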