Closed TechnikTobi closed 1 year ago
Hi, thanks for your feedback!
Yes, now that you point it out I see the problem. Based on your input regarding symlink_file
/ symlink_inner
and the two functions used for converting paths ( to_u16s
and maybe_verbatim
- which, as it turns out, also calls to_u16s
here in a first attempt to convert the path into a vector of u16 values and optionally resolves any problematic characters. So, from what I can tell, maybe_verbatim
should be more robust - why symlink_inner
only uses to_u16s
for one of its paths and not maybe_verbatim
for both of them is therefore a mystery to me). Inside symlink_inner
the converted paths are used to call the Win32 API function CreateSymbolicLinkW
(Microsoft documentation) which takes the file names as LPCWSTR
parameters - (long) pointers to constant wide strings (see here). At this point I don't that the inner workings of symlink_file
are not of much help to solve this issue as gexiv2 works a bit differently:
Taking a look at what happens to the path after converting it to a CString (pointer) - for example, save_to_file
calls gexiv2_metadata_save_file
, which then calls convert_path
see here to get a std::wstring
(according to some C++ documentation this is a string using wchar_t
for its characters, which seems to support UTF-16). This in turn calls g_utf8_to_utf16
, a GLib function (documentation, source code) for converting UTF-8 strings to UTF-16. So, gexiv2 actually seems to support paths that use UTF-16 (at least to some extend), even though they need to be provided as UTF-8 strings. This raises some questions:
g_utf8_to_utf16
only able to map characters that are valid in both character sets? In that case fully supporting UTF-16 paths within rexiv2 does not really make sense if gexiv2 isn't able to handle them anyway. Of course, this does not mean that my previously suggested fix is still a good idea - I think that at this point some differentiation between windows and non-windows systems in the conversion might be a good idea, e.g. using cfg. g_utf8_to_utf16
is in fact able to convert "all possible paths": What would the corresponding r_utf16_to_utf8 on our side have to look like, i.e. how to encode the UTF-16 string as UTF-8 so that g_utf8_to_utf16
would be able to reconstruct the UTF-16 path? EDIT1: There was one thing I forgot to look up before writing this reply: Checking if g_utf16_to_utf8
already exists! This could potentially be useful. END EDIT1Either way, to comment on your idea for a to_platform_path_ptr()
: Of course something like that would be nice in general. However, even if such a function exists, I don't think it would be of much help in this specific case due to how gexiv2 further processes the provided path-strings.
Where to go from here? I think answering the questions above would be a helpful next step, so I'll try to look further into them - under the assumption that the questions themselves make sense and are not "ill-formed" due to some misunderstanding from my side (unfortunately however, with university starting again soon I'm afraid that progress will slow down on my side as well 😅).
The OsStr::to_str() is perfectly valid in a Windows environment. It convert the OS rust path encoding (WTF-8 in Windows) to UTF-8 which is needed for the gexiv2_metadata_open_path() function. I understand that it’s stupid to do a double conversion but it’s there is no other way. See the code in glib-rs: https://github.com/gtk-rs/gtk-rs-core/blob/3ef9422402106821ede05317ec7b999128bf126e/glib/src/translate.rs#L639
Thanks again for the patch, and continuing to improve it! Is it's current status that you think it's basically ready to be merged, or were you planning any further changes?
Were you by any chance able to validate it on a Windows system?
I test it in a Windows MSYS2 environment and I got 100% test successful.
From my point of view this should be ready for merging.
I haven't been able to test this yet in an appropriate environment, so thank you @Dalan94 for testing!
Thank you again both for the patch and the testing and research on how it works on Windows.
I've updated the PR to be rebased onto the latest main
and to clean up some formatting that the linter was complaining about.
@TechnikTobi I hope it's OK with you that I also added you to the list of contributors in CONTRIBUTORS.md
.
I'll be merging this momentarily and then putting out a new release with this fix shortly thereafter.
To propose a solution for #59 I changed the creation of the c_str_path variables used in some functions (e.g.
new_from_path
) by inserting a conversion ofOsStr
tostr
first, before callingas_bytes()
, eliminating the need for the usage ofstd::os::unix::ffi::OsStrExt
, making the library hopefully compile under windows (as I don't have a Windows machine at hand I could not test it yet, but as the errors mentioned in the issue only involve the usage of the unix related functions there is a good chance that this resolves the issue). I left a copy of the old version in as a comment (line 279) above the new conversion for comparison.Running all tests (using
cargo test
) results in none of them failing. As at least one test includes a call tonew_from_path
, where the new conversion gets used without problems, the proposed solution should be fine.