AcademySoftwareFoundation / openexr

The OpenEXR project provides the specification and reference implementation of the EXR file format, the professional-grade image storage format of the motion picture industry.
http://www.openexr.com/
BSD 3-Clause "New" or "Revised" License

Imf::isOpenExrFile doesn't handle UTF-8 filenames properly #292

Open meshula opened 6 years ago

meshula commented 6 years ago

@lgritz has implemented a workaround for inconsistent UTF-8 filename handling in Imf::isOpenExrFile:

https://github.com/OpenImageIO/oiio/pull/1941/commits/056507c93da60148b9af6ee0390eda45f01696c1

Fix should occur in OpenEXR directly methinks

lgritz commented 6 years ago

Note to those who may look at the above referenced OIIO PR: the magic is that in my overloaded Imf::IStream, I handled UTF-8 filenames properly when opening the files in the first place.

Feel free to swipe the code from OIIO, I'm happy to point somebody to the exact bits that are needed. The crux of it is this:

#ifdef _WIN32
std::wstring
Strutil::utf8_to_utf16 (string_view str)
{
    std::wstring native;
    native.resize(MultiByteToWideChar (CP_UTF8, 0, str.data(), str.length(), NULL, 0));
    MultiByteToWideChar (CP_UTF8, 0, str.data(), str.length(), &native[0], (int)native.size());
    return native;
}
#endif

FILE*
Filesystem::fopen (string_view path, string_view mode)
{
#ifdef _WIN32
    // on Windows fopen does not accept UTF-8 paths, so we convert to wide char
    std::wstring wpath = Strutil::utf8_to_utf16 (path);
    std::wstring wmode = Strutil::utf8_to_utf16 (mode);
    return ::_wfopen (wpath.c_str(), wmode.c_str());
#else
    // on Unix platforms passing in UTF-8 works
    return ::fopen (path.c_str(), mode.c_str());
#endif
}
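
For readers without the OIIO helpers at hand, here is a minimal, self-contained sketch of the same idea applied to an isOpenExrFile-style check. It is not the OpenEXR or OIIO implementation: `open_utf8` and `looks_like_exr` are hypothetical names, and the check only compares the first four bytes against the EXR magic number (20000630, stored little-endian as 0x76 0x2f 0x31 0x01).

```cpp
#include <cstdio>
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: open a file whose path is encoded as UTF-8.
static FILE* open_utf8(const std::string& path, const char* mode)
{
#ifdef _WIN32
    // On Windows, convert UTF-8 to UTF-16 and use the wide-character API.
    auto widen = [](const std::string& s) {
        std::wstring w;
        if (s.empty()) return w;
        int n = MultiByteToWideChar(CP_UTF8, 0, s.data(), (int)s.size(), nullptr, 0);
        w.resize(n);
        MultiByteToWideChar(CP_UTF8, 0, s.data(), (int)s.size(), &w[0], n);
        return w;
    };
    return _wfopen(widen(path).c_str(), widen(mode).c_str());
#else
    // Assumption: on Unix-like platforms the path bytes are passed through as-is.
    return std::fopen(path.c_str(), mode);
#endif
}

// Hypothetical check: true if the file starts with the EXR magic number.
static bool looks_like_exr(const std::string& utf8_path)
{
    FILE* f = open_utf8(utf8_path, "rb");
    if (!f) return false;
    unsigned char b[4] = {0, 0, 0, 0};
    size_t n = std::fread(b, 1, 4, f);
    std::fclose(f);
    return n == 4 && b[0] == 0x76 && b[1] == 0x2f && b[2] == 0x31 && b[3] == 0x01;
}
```

The point of the sketch is only that the path conversion happens at the moment the file is opened, so the check itself never sees a narrow-character Windows API.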

meshula commented 6 years ago

Thank you sir!

claudeha commented 3 years ago

One of my Windows users recently ran into an issue with OpenEXR and non-ASCII paths. I confirmed the issue in Wine. I cross-compile my program for Windows from Linux with MinGW64. The other libraries I use (libpng etc) had no such issues. I got non-ASCII paths working (at least in Wine, Microsoft Windows remains to be tested) by deleting all the Windows-specific stuff in ImfStdIO.cpp:

sed -i "s/#ifdef _WIN32/#if 0/g" OpenEXR/IlmImf/ImfStdIO.cpp

meshula commented 3 years ago

This looks like another case where we need a MINGW check in addition to the WIN32 check. We should probably grep the whole codebase for other such instances.

alvinhochun commented 2 years ago

The problem is that it now treats char-strings as UTF-8. In some way this is a good thing, but the typical convention on Windows is that char-strings (PSTR / LPSTR) when used with the -A version of WinAPI are encoded in the system ANSI code page (ACP), which in most cases is not UTF-8. To convert a string from ACP to UTF-8, I think the "native" way is to first use MultiByteToWideChar to convert it to UTF-16 as a wchar_t-string (LPWSTR), then back with WideCharToMultiByte to convert it to UTF-8. (Amusingly, OpenEXR will then convert it back to UTF-16 again to open the file with the wide API.)

Edit: While you can convert paths from ACP to UTF-8, it is probably best to always keep your paths in UTF-16 if they come from WinAPI in the first place, because ACP cannot represent all characters supported in Unicode.

If you are unaware of this and pass non-ASCII paths encoded in the system ACP to OpenEXR, you will fail to open the file. I don't think it has anything to do with MinGW.
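
The ACP to UTF-16 to UTF-8 round trip described above could look roughly like the following. This is a sketch under stated assumptions, not code from OpenEXR: `acp_to_utf8` is a hypothetical helper name, and on non-Windows platforms it simply returns its input on the assumption that char-string paths are already UTF-8 there.

```cpp
#include <string>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of OpenEXR): re-encode a char-string from the
// system ANSI code page (ACP) to UTF-8 by going through UTF-16.
std::string acp_to_utf8(const std::string& acp)
{
#ifdef _WIN32
    if (acp.empty()) return std::string();

    // Step 1: ACP -> UTF-16. The first call only asks for the required length.
    int wlen = MultiByteToWideChar(CP_ACP, 0, acp.data(), (int)acp.size(), nullptr, 0);
    if (wlen <= 0) return std::string();
    std::wstring wide(wlen, L'\0');
    MultiByteToWideChar(CP_ACP, 0, acp.data(), (int)acp.size(), &wide[0], wlen);

    // Step 2: UTF-16 -> UTF-8, again sizing the buffer first.
    int ulen = WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen, nullptr, 0, nullptr, nullptr);
    if (ulen <= 0) return std::string();
    std::string utf8(ulen, '\0');
    WideCharToMultiByte(CP_UTF8, 0, wide.data(), wlen, &utf8[0], ulen, nullptr, nullptr);
    return utf8;
#else
    // Assumption: outside Windows there is no ACP and the path is already UTF-8.
    return acp;
#endif
}
```

A caller holding an ACP-encoded path would pass `acp_to_utf8(path)` to the library instead of `path`. As the edit above notes, this only helps for paths that the ACP can represent at all; paths obtained from the wide WinAPI are better converted straight from UTF-16.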

meshula commented 2 years ago

Ah, that's right: claudeha's issue isn't specific to MinGW, but is an issue encountered under their cross-compilation setup.