tfussell / xlnt

:bar_chart: Cross-platform user-friendly xlsx library for C++11+

When outputting file paths, emojis will not be displayed if they are included in the path, here are my thoughts. #709

Open DamonGX opened 1 year ago

DamonGX commented 1 year ago

//#ifdef _MSC_VER
//std::wstring path::wstring() const
//{
//    std::wstring_convert<std::codecvt_utf8<wchar_t>> convert;
//    return convert.from_bytes(string());
//}
//#endif

#ifdef _MSC_VER

// The header name was lost from the post; <windows.h> is assumed here because it declares MultiByteToWideChar.
#include <windows.h>

std::wstring path::wstring() const
{
    const char *str{string().c_str()};
    size_t strSize = strlen(str);
    int unicodeSize = MultiByteToWideChar(CP_UTF8, 0, str, strSize, NULL, 0);
    wchar_t *unicodeStr = new wchar_t[unicodeSize]{L'\0'};
    MultiByteToWideChar(CP_UTF8, 0, str, strSize, unicodeStr, unicodeSize);
    return {unicodeStr};
}

#endif

doomlaur commented 1 year ago

I'm not sure which version of xlnt you are using, but if you're using version 1.5.0, you should maybe use the latest commit from the master branch instead. The issue you describe has been fixed by pull request #607 by using std::wstring_convert<std::codecvt_utf8_utf16<wchar_t>>. As explained on cppreference, the previous code converted UTF-8 to UCS-2 (the predecessor of UTF-16) on Windows, so Unicode code points outside the Basic Multilingual Plane (like emojis), which need 4 bytes in UTF-16, failed to convert. Unfortunately, the fix has not been released in a stable version of xlnt yet, but at least in my experience, the master branch seems to be even more stable than version 1.5.0, as it contains many bugfixes, so you should definitely give it a go.
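
To illustrate why UCS-2 cannot hold an emoji, here is a minimal sketch of how a single code point maps to UTF-16 code units. The helper name encode_utf16 is hypothetical and not part of xlnt or the standard library, and it assumes the caller passes a valid code point (no surrogates, nothing above U+10FFFF).

```cpp
#include <string>

// Hypothetical helper for illustration only; not part of xlnt.
// Encodes one valid Unicode code point as UTF-16 code units.
std::u16string encode_utf16(char32_t cp)
{
    std::u16string out;

    if (cp < 0x10000)
    {
        // Basic Multilingual Plane: a single 16-bit unit, the same value UCS-2 would use.
        out.push_back(static_cast<char16_t>(cp));
    }
    else
    {
        // Supplementary planes: 20 bits remain after subtracting 0x10000.
        // They are split into a high and a low surrogate, which UCS-2 cannot express.
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }

    return out;
}
```

With this sketch, U+00E9 (é) yields one code unit, while U+1F600 (😀) yields the pair 0xD83D 0xDE00, which is the case a UCS-2 conversion cannot represent.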

For the record: std::wstring_convert and the conversion facets from <codecvt> have been deprecated since C++17. Since wide strings are only used on Windows, your solution is a very good alternative 👍 In fact, I'm already using that in some projects. To avoid the memory leak at the end of your code snippet (the wchar_t array never gets deleted) and to avoid copying unnecessarily, the alternative to std::wstring_convert could be the following (slightly adapted version of your code):

#ifdef _MSC_VER

#include <stringapiset.h>

std::wstring path::wstring() const
{
    const std::string & path_str = string();
    int size = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, path_str.c_str(), static_cast<int>(path_str.length()), nullptr, 0);

    if (size > 0)
    {
        std::wstring path_converted(size, L'\0');
        // &path_converted[0] rather than data(): data() only returns a non-const pointer since C++17.
        size = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, path_str.c_str(), static_cast<int>(path_str.length()), &path_converted[0], size);
        return path_converted;
    }
    else
    {
        return {};
    }
}
#endif
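
If you want to check the behaviour outside Windows, the following is a rough portable sketch of what a strict UTF-8 to UTF-16 conversion does. It is not the Windows API and not xlnt code: the name utf8_to_utf16_strict is made up for this example, and returning an empty string on invalid input is only meant to mirror the `return {}` branch above.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical portable stand-in for illustration; not part of xlnt or Windows.
// Returns an empty string if the input is not well-formed UTF-8.
std::u16string utf8_to_utf16_strict(const std::string &in)
{
    std::u16string out;
    std::size_t i = 0;

    while (i < in.size())
    {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;    // continuation bytes expected after the lead byte
        std::uint32_t min_cp = 0; // smallest code point allowed for this length

        if (lead < 0x80) { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; min_cp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; min_cp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; min_cp = 0x10000; }
        else { return {}; } // stray continuation byte or invalid lead byte

        if (in.size() - i <= extra) { return {}; } // sequence is truncated

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { return {}; } // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        // Reject overlong forms, surrogate code points and values above U+10FFFF.
        if (cp < min_cp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) { return {}; }

        if (cp < 0x10000)
        {
            out.push_back(static_cast<char16_t>(cp));
        }
        else
        {
            // Outside the BMP: emit a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }

        i += extra + 1;
    }

    return out;
}
```

For a path such as "a😀.xlsx" this produces 8 code units (the emoji takes two), while a truncated or overlong sequence produces an empty result. As with the function above, an empty input is indistinguishable from a failed conversion, so a caller that needs to tell them apart would have to check for that separately.
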
DamonGX commented 1 year ago

Wow. Thank you for your reply. Your code takes care of a memory leak that I had not considered, and I have learned a lot from your explanation. Thank you again.