rikyoz / bit7z

A C++ static library offering a clean and simple interface to the 7-zip shared libraries.
https://rikyoz.github.io/bit7z
Mozilla Public License 2.0

[Bug]: On Windows narrow returns empty strings for items #172

Closed vm2mv closed 11 months ago

vm2mv commented 11 months ago

bit7z version

4.0.x

Compilation options

BIT7Z_AUTO_FORMAT, BIT7Z_AUTO_PREFIX_LONG_PATHS, BIT7Z_BUILD_TESTS, BIT7Z_REGEX_MATCHING, BIT7Z_STATIC_RUNTIME

7-zip version

v23.01

7-zip shared library used

7z.dll / 7z.so

Compilers

MSVC

Compiler versions

MSVC 2022

Architecture

x86_64, x86

Operating system

Windows

Operating system versions

Windows 10 build 1607 (without embedded ICU)

Bug description

On Windows, narrow returns empty strings for items (when built without BIT7Z_USE_NATIVE_STRING).

#ifdef _WIN32
#ifdef BIT7Z_USE_SYSTEM_CODEPAGE
#define CODEPAGE CP_ACP
#define CODEPAGE_FLAGS 0
#else
#define CODEPAGE CP_UTF8
#define CODEPAGE_FLAGS WC_NO_BEST_FIT_CHARS
#endif
#else
...
#if !defined( _WIN32 ) || !defined( BIT7Z_USE_NATIVE_STRING )
auto narrow( const wchar_t* wideString, size_t size ) -> std::string {
    if ( wideString == nullptr || size == 0 ) {
        return "";
    }
#ifdef _WIN32
    const int narrowStringSize = WideCharToMultiByte( CODEPAGE,
                                                      CODEPAGE_FLAGS,
                                                      wideString,
                                                      static_cast< int >( size ),
                                                      nullptr,
                                                      0,
                                                      nullptr,
                                                      nullptr );
    if ( narrowStringSize == 0 ) {
        return "";
    }
...

WideCharToMultiByte returns zero, and GetLastError() reports 1004 (ERROR_INVALID_FLAGS).

See: https://learn.microsoft.com/en-us/windows/win32/api/stringapiset/nf-stringapiset-widechartomultibyte

For the code page 65001 (UTF-8) or the code page 54936 (GB18030, Windows Vista and later), dwFlags must be set to either 0 or WC_ERR_INVALID_CHARS. Otherwise, the function fails with ERROR_INVALID_FLAGS.

WC_NO_BEST_FIT_CHARS is not compatible with CP_UTF8
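To make the documented rule concrete, here is a minimal sketch of a check that reduces a requested flag set to what the documentation allows for a given code page. The helper name validFlagsFor and the constant kCodePageGb18030 are hypothetical and are not part of bit7z or the Windows API. The numeric stand-ins in the non-Windows branch mirror the Windows SDK values and are only there so the sketch compiles on other platforms.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#else
// Stand-ins mirroring the Windows SDK values, so the sketch compiles elsewhere.
constexpr unsigned int  CP_UTF8              = 65001;
constexpr unsigned long WC_ERR_INVALID_CHARS = 0x00000080;
constexpr unsigned long WC_NO_BEST_FIT_CHARS = 0x00000400;
#endif

// Hypothetical constant: the GB18030 code page number quoted from the documentation.
constexpr unsigned int kCodePageGb18030 = 54936;

// Hypothetical helper: for UTF-8 and GB18030, only 0 or WC_ERR_INVALID_CHARS
// is documented as valid, so every other bit is masked out.
// Other code pages keep the requested flags unchanged.
constexpr unsigned long validFlagsFor( unsigned int codePage, unsigned long requested ) {
    return ( codePage == CP_UTF8 || codePage == kCodePageGb18030 )
               ? ( requested & WC_ERR_INVALID_CHARS )
               : requested;
}

// The combination from this report collapses to 0, which is documented as valid.
static_assert( validFlagsFor( CP_UTF8, WC_NO_BEST_FIT_CHARS ) == 0,
               "WC_NO_BEST_FIT_CHARS must be dropped for CP_UTF8" );

// Runtime usage check: WC_ERR_INVALID_CHARS survives the masking for CP_UTF8.
inline void checkValidFlags() {
    assert( validFlagsFor( CP_UTF8, WC_ERR_INVALID_CHARS | WC_NO_BEST_FIT_CHARS )
            == WC_ERR_INVALID_CHARS );
}
```

With CODEPAGE defined as CP_UTF8, a CODEPAGE_FLAGS value of 0 (or WC_ERR_INVALID_CHARS) would satisfy this check, while WC_NO_BEST_FIT_CHARS would not.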

Steps to reproduce

No response

Expected behavior

No response

Relevant compilation output

No response

rikyoz commented 11 months ago

Hi!

WC_NO_BEST_FIT_CHARS is not compatible with CP_UTF8

You're right; I must have missed that part in the documentation!

Strangely, this problem has never happened to me in any of my (numerous) tests, and it still does not happen now...

Anyway, I'm working on a fix right now.

Thank you for reporting the issue! 🙏

rikyoz commented 11 months ago

Fixed in v4.0.3.

I could not test Windows 10 build 1607 specifically, but I managed to reproduce the problem, although only on Windows 7.

In my tests, more recent Windows 10 versions, as well as Windows 11, have no problems when passing both CP_UTF8 and WC_NO_BEST_FIT_CHARS to WideCharToMultiByte. That's probably why the issue went unnoticed before your report!

Maybe Microsoft changed the behavior of WideCharToMultiByte at some point, and they forgot to update the documentation accordingly, or it is simply a bug in the more recent versions of Windows.

Anyway, it's better to stick to the documentation, and I fixed the code accordingly. I apologize for the delay in releasing the fix; testing took some time. Thank you again!
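For readers who want to see what sticking to the documentation can look like, below is a minimal sketch of a wide-to-UTF-8 conversion that passes 0 as dwFlags. It is not the change shipped in v4.0.3, and the function name narrowUtf8 is hypothetical. The non-Windows branch is a simplified stand-in that assumes a 32-bit wchar_t holding one code point per element and performs no validation; it exists only so the sketch can be compiled and tried outside Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical function, not the bit7z implementation.
std::string narrowUtf8( const wchar_t* wideString, std::size_t size ) {
    if ( wideString == nullptr || size == 0 ) {
        return {};
    }
#ifdef _WIN32
    const int length = static_cast< int >( size );
    // First call: dwFlags = 0 is valid for CP_UTF8; returns the required byte count.
    const int needed = WideCharToMultiByte( CP_UTF8, 0, wideString, length,
                                            nullptr, 0, nullptr, nullptr );
    if ( needed <= 0 ) {
        return {}; // GetLastError() holds the reason, e.g. ERROR_INVALID_FLAGS.
    }
    std::string result( static_cast< std::size_t >( needed ), '\0' );
    // Second call: performs the conversion into the allocated buffer.
    const int written = WideCharToMultiByte( CP_UTF8, 0, wideString, length,
                                             &result[ 0 ], needed, nullptr, nullptr );
    result.resize( written > 0 ? static_cast< std::size_t >( written ) : 0 );
    return result;
#else
    // Stand-in (assumption): each wchar_t is one Unicode code point, unvalidated.
    std::string result;
    for ( std::size_t i = 0; i < size; ++i ) {
        const unsigned long cp = static_cast< unsigned long >( wideString[ i ] );
        if ( cp < 0x80 ) {
            result += static_cast< char >( cp );
        } else if ( cp < 0x800 ) {
            result += static_cast< char >( 0xC0 | ( cp >> 6 ) );
            result += static_cast< char >( 0x80 | ( cp & 0x3F ) );
        } else if ( cp < 0x10000 ) {
            result += static_cast< char >( 0xE0 | ( cp >> 12 ) );
            result += static_cast< char >( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
            result += static_cast< char >( 0x80 | ( cp & 0x3F ) );
        } else {
            result += static_cast< char >( 0xF0 | ( cp >> 18 ) );
            result += static_cast< char >( 0x80 | ( ( cp >> 12 ) & 0x3F ) );
            result += static_cast< char >( 0x80 | ( ( cp >> 6 ) & 0x3F ) );
            result += static_cast< char >( 0x80 | ( cp & 0x3F ) );
        }
    }
    return result;
#endif
}

// Usage check: plain ASCII item names pass through unchanged.
inline void narrowUtf8Example() {
    assert( narrowUtf8( L"a.txt", 5 ) == "a.txt" );
}
```

Returning an empty string on failure mirrors the snippet in the report; a caller that needs to tell a failed conversion from an empty name would have to inspect GetLastError() or use a different return type.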

vm2mv commented 11 months ago

Thank you for the wonderful library and quick fix!