hydrogen-music / hydrogen

The advanced drum machine for Linux, macOS, and Windows
http://www.hydrogen-music.org
GNU General Public License v2.0

Export to mid/wav fails when file/directory tree contains diacritics or Cyrillic characters #1957

Closed msjasinski closed 4 days ago

msjasinski commented 3 months ago

Hydrogen version: 1.2.3
Operating system + version: Windows 11 64-bit (and earlier versions too)
Audio driver + version: PortAudio


Exporting to MIDI or WAV files fails if the filename or directory structure (e.g. D:\Hydrogen\emptySong.wav) contains characters such as ążźćśł (Polish) or фыва (Russian). These characters work fine in .h2song files, however.

theGreatWhiteShark commented 3 months ago

Hey @msjasinski ,

Thanks for reporting!

Exporting the song to LilyPond or exporting the current drumkit does not work either, but only on Windows. On Linux everything works fine.

I'll have a look.

theGreatWhiteShark commented 3 months ago

Could you check whether you are able to export MIDI or WAV files using this version of Hydrogen?

Drumkit import and export, however, will still not work with Cyrillic scripts or the Polish additions to the Latin alphabet in either filename or parent folders. That's a limitation of the compression library we use.

msjasinski commented 3 months ago

Hello @theGreatWhiteShark ! I tested it thoroughly and it works!!! Thank you very much! Keep up the good work! All the best

msjasinski commented 3 months ago

A related bug: when trying to open Hydrogen files (.h2song) containing characters as described above from File Explorer (or similar, but not from Hydrogen's open menu), I still get an error message. [screenshot]

theGreatWhiteShark commented 3 months ago

> A related bug - when trying to open Hydrogen files (.h2song), containing characters as described above, from File Explorer (or similar - but not from Hydrogen open menu), I still get an error message

Phew. That's a hard one. I can reproduce it, but I am afraid I cannot fix it.

I was able to fix the encoding bugs within Hydrogen by enforcing UTF-8 encoding and relying on the built-in file-handling functions of Qt, the framework we use. Qt probably uses the UTF-16 versions of the Windows API, so everything works fine.

But arguments passed to the application during startup seem to be more difficult to handle. There, Qt uses the system encoding, with seemingly no way to override this behavior. Since both your system encoding and mine are set incorrectly and do not allow for Cyrillic characters, Hydrogen only receives a garbled path in which all non-Latin-1 characters are lost, with no way to determine the song's original location.

(Why is the encoding off even after installing the language kit and being able to set the keyboard to e.g. Russian and type Cyrillic letters? No idea. I'm not a Windows person. But from Hydrogen's perspective, Windows is telling us that it does not support these characters.)
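The loss described above can be illustrated with a minimal C++ sketch. It assumes the narrow system encoding behaves like Latin-1 and that unrepresentable characters are replaced by '?'; the actual Windows code page and substitution character may differ. The function names are hypothetical and are not taken from Hydrogen or Qt.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Decode a UTF-8 byte string into Unicode code points.
// Assumes well-formed input; this is an illustration, not a validator.
std::vector<char32_t> decodeUtf8(const std::string& utf8) {
    std::vector<char32_t> codePoints;
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        char32_t cp;
        std::size_t len;
        if (lead < 0x80)      { cp = lead;        len = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1F; len = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0F; len = 3; }
        else                  { cp = lead & 0x07; len = 4; }
        // Append the payload bits of each continuation byte.
        for (std::size_t k = 1; k < len && i + k < utf8.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        codePoints.push_back(cp);
        i += len;
    }
    return codePoints;
}

// Narrow a UTF-8 string to Latin-1. Code points above U+00FF have no
// Latin-1 representation and are replaced by '?' (assumed substitution).
std::string toLatin1Lossy(const std::string& utf8) {
    std::string narrowed;
    for (char32_t cp : decodeUtf8(utf8)) {
        narrowed.push_back(cp <= 0xFF ? static_cast<char>(cp) : '?');
    }
    return narrowed;
}
```

Under these assumptions, `toLatin1Lossy("test 21\xD1\x84.h2song")` (the UTF-8 bytes of test 21ф.h2song) yields `test 21?.h2song`. Once the original character is gone, the application has no way to map the argument back to the file on disk.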

theGreatWhiteShark commented 3 months ago

@msjasinski could you do me a favor and install this version of Hydrogen and attach the log messages?

I patched it to report the system's encoding. Just to be entirely sure we are talking about the same issue here.

elpescado commented 3 months ago

> Drumkit import and export, however, will still not work with Cyrillic scripts or the Polish additions to the Latin alphabet in either filename or parent folders. That's a limitation of the compression library we use.

That shortcoming of libarchive might already have been addressed:

https://github.com/libarchive/libarchive/pull/2016

elpescado commented 3 months ago

Alternatively, maybe using archive_read_open_fd with a file descriptor obtained from _wopen, instead of archive_read_open_filename, would work on Windows?

theGreatWhiteShark commented 3 months ago

> That shortcoming of libarchive might already have been addressed:
>
> https://github.com/libarchive/libarchive/pull/2016

Hmm. I'm not sure. In the PR they state that the patch only affects native Windows builds. But we ship a version obtained from the MSYS2 repos. I don't know much about our Windows toolchain or libarchive in particular, but I wouldn't be surprised if the library was configured to use the POSIX interface provided by MSYS instead of the underlying Windows API.

> Alternatively, maybe using archive_read_open_fd with fd obtained from _wopen on Windows instead of archive_read_open_filename would work on Windows?

I thought about this too but decided not to implement it. I'm just not familiar enough with the stability and backward compatibility of the Windows API, possible friction when putting it next to MSYS2 code, etc. Handling archives is such a vital part of Hydrogen that I'm a little afraid of breaking things for Windows users, especially since I am not using this OS.

I read this document: https://github.com/libarchive/libarchive/wiki/Filenames#the-problem and got the impression UTF-8 support is not yet "solved" in libarchive. But I get that this is an important topic for some users and I will have another look (and come up with at least a workaround).

theGreatWhiteShark commented 3 months ago

> A related bug - when trying to open Hydrogen files (.h2song), containing characters as described above, from File Explorer (or similar - but not from Hydrogen open menu), I still get an error message

@msjasinski I added a wiki page on how to fix this issue by tweaking the Windows settings.

elpescado commented 3 months ago

> Hmm. I'm not sure. Within the PR they stated the patch is only affecting native Windows builds. But we ship a version obtained from the MSYS2 repos. I don't know much about our Windows toolchain or libarchive in particular but I wouldn't be surprised if the library was configured to use the POSIX interface provided by MSYS instead of the underlying Windows API.

It's been a while since I've used Windows, but I was under the impression that MSYS is a collection of POSIX shell utilities (bash, fileutils, etc.), while the actual compiler is MinGW, i.e. the "native" Windows build of GCC that links with msvcrt, as opposed to Cygwin, a.k.a. "POSIX-on-Windows GCC". But I might be wrong; GCC on Windows is ultra-confusing.

theGreatWhiteShark commented 3 months ago

I took a look at the source code of libarchive and things are much easier than I thought. They have dedicated functions for the UTF-16 Windows API calls, like archive_write_open_filename_w. I wasn't aware of them previously because I used the man pages linked on their official GitHub page for reference. But it seems these are generated on FreeBSD, so all the Windows-specific parts were removed by #ifdefs. How inconvenient!

I'll rewrite import/export using these functions. Import with Cyrillic characters in drumkit path already works.
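A rough sketch of how a UTF-8 path could be handed to the wide-character variant mentioned above. This is not Hydrogen's actual implementation: `utf8ToWide` and `openArchiveForWriting` are hypothetical helper names, the decoder assumes well-formed UTF-8, and in a Qt code base `QString::toStdWString()` would be the more natural way to obtain the wide string. Only `archive_write_open_filename_w` comes from libarchive.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Convert a UTF-8 string to a wide string. Where wchar_t is 16 bits wide
// (as on Windows) code points above U+FFFF become UTF-16 surrogate pairs.
// Assumes well-formed input; this is an illustration, not a validator.
std::wstring utf8ToWide(const std::string& utf8) {
    std::wstring wide;
    for (std::size_t i = 0; i < utf8.size();) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::uint32_t cp;
        std::size_t len;
        if (lead < 0x80)      { cp = lead;        len = 1; }
        else if (lead < 0xE0) { cp = lead & 0x1F; len = 2; }
        else if (lead < 0xF0) { cp = lead & 0x0F; len = 3; }
        else                  { cp = lead & 0x07; len = 4; }
        // Append the payload bits of each continuation byte.
        for (std::size_t k = 1; k < len && i + k < utf8.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(utf8[i + k]) & 0x3F);
        }
        i += len;
        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            cp -= 0x10000;
            wide.push_back(static_cast<wchar_t>(0xD800 + (cp >> 10)));
            wide.push_back(static_cast<wchar_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            wide.push_back(static_cast<wchar_t>(cp));
        }
    }
    return wide;
}

#if defined(_WIN32)
#include <archive.h>

// Hypothetical helper: open an archive for writing at a UTF-8 encoded path
// by passing the converted path to libarchive's wide-character function.
// Returns libarchive's status code (ARCHIVE_OK on success).
inline int openArchiveForWriting(struct archive* a, const std::string& utf8Path) {
    return archive_write_open_filename_w(a, utf8ToWide(utf8Path).c_str());
}
#endif
```

The conversion itself is portable, so it can be checked on any platform; the libarchive call is compiled only on Windows, where the narrow-character path is the one that loses characters.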

msjasinski commented 3 months ago

> @msjasinski could you do me a favor and install this version of Hydrogen and attach the log messages?
>
> I patched it to report the system's encoding. Just to be entirely sure we are talking about the same issue here.

Here it is: log.txt

The same happens if the directory has no diacritics but the filename does. If neither contains diacritics, the file loads OK.

theGreatWhiteShark commented 2 months ago

> Here it is: log.txt
>
> Same if directory has no diacritics but filename does have them. If case of no diacritics, the file loads OK.

👍🏿 Nice. That's exactly how things are on my local machine.

theGreatWhiteShark commented 2 weeks ago

Hey @msjasinski ,

I had another look, but it seems UTF-8 support for drumkit export is something we cannot guarantee for now (due to limitations of a third-party library we use). But all other save/open/export/import actions should now work properly.

Could you give this version of Hydrogen one more try to double check that everything is working?

theGreatWhiteShark commented 4 days ago

Closed with #1981. If anything does not work yet, please feel free to reopen this issue.

msjasinski commented 4 days ago

Thanks very much! It seems alright. I'll comment if I find anything suspicious.

msjasinski commented 4 days ago

Export midi files and Export Song work.

I couldn't test properly, because in Windows 11 file associations (with multiple versions of the same program) don't work very well out of the box. Now I tried harder, and this still does not work: when trying to open Hydrogen files (.h2song) containing characters as described above from File Explorer (or similar, but not from Hydrogen's open menu), I still get an error message.

E.g. filenames: test 21ą.h2song, test 21ф.h2song

theGreatWhiteShark commented 3 days ago

Hmm. Strange. On my machine both files are working.

> I couldn't test properly, because in Windows 11 file associations (with multiple versions of the same program) don't work very well out of the box.

Maybe an older version is used while opening. Could you remove all existing versions of Hydrogen, install the latest one, and try again?

You can check the version via the menu in Info > About and it should be "Hydrogen-1.2.3-224-gd4f4da526".

msjasinski commented 3 days ago

No, it doesn't open .h2song files from file explorers in the newest version (224) if the filename or pathname contains diacritics or Cyrillic characters.

msjasinski commented 3 days ago

Also, a minor thing, but rather important to me: the program (regardless of how it is opened; this is unrelated to the case above) doesn't start maximized (even if I select that in the file properties). I prefer maximized program windows, usually one per screen. [screenshot] It looks like this, with the upper part of the title partly hidden above the screen.

theGreatWhiteShark commented 1 day ago

> No, It doesn't open .h2song files from file explorers in the newest version (224) if filename of pathname contains diacritics or cyrillics.

Could you try again with this version, which does some additional logging? You can then view all the log messages via "Info > Open log file". Could you post them here so I can have a look?

> It looks like this, with the upper part of title partly hidden above the screen.

Hmm. Is this something new or did that always happen?