hydrogen-music / hydrogen

The advanced drum machine for Linux, macOS, and Windows
http://www.hydrogen-music.org
GNU General Public License v2.0

Fix UTF-8 support in `Drumkit` import and export using `libarchive` (#1957) #2022

Closed theGreatWhiteShark closed 2 months ago

theGreatWhiteShark commented 3 months ago

I finally got word from somebody at libarchive and it turns out one has to enforce UTF-8 encoding [1] in order to avoid segfaults in some versions when encountering non-Latin1 characters. Using this patch we can also support UTF-8 drumkit import and export.

Enforcing the locale works fine on systems that do not have the proper locale set for the provided character set. (This sounds like an edge case, but we had several issues and discussions about it.)
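A minimal sketch of what enforcing a UTF-8 locale around the archive calls could look like. This is not the code from the patch: the class name `ScopedUtf8Locale`, the list of candidate locale names, and the choice of `LC_CTYPE` are all assumptions made for illustration. The guard tries a few UTF-8 locales, records whether one could be set, and restores the previous locale when it goes out of scope.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper (not part of Hydrogen): temporarily switches LC_CTYPE
// to a UTF-8 locale and restores the previous one on destruction.
class ScopedUtf8Locale {
public:
    ScopedUtf8Locale() {
        // Querying with nullptr returns the current locale name.
        const char* pCurrent = std::setlocale( LC_CTYPE, nullptr );
        m_sPrevious = pCurrent != nullptr ? pCurrent : "C";

        // Candidate names are an assumption; which of them exist depends
        // on the platform and on the locales installed on the system.
        static const char* const candidates[] = {
            "C.UTF-8", "en_US.UTF-8", ".UTF-8" };
        for ( const char* sName : candidates ) {
            if ( std::setlocale( LC_CTYPE, sName ) != nullptr ) {
                m_bEnforced = true;
                break;
            }
        }
    }

    ~ScopedUtf8Locale() {
        std::setlocale( LC_CTYPE, m_sPrevious.c_str() );
    }

    ScopedUtf8Locale( const ScopedUtf8Locale& ) = delete;
    ScopedUtf8Locale& operator=( const ScopedUtf8Locale& ) = delete;

    // false means no UTF-8 locale is available on this system.
    bool enforced() const { return m_bEnforced; }

private:
    std::string m_sPrevious;
    bool m_bEnforced = false;
};
```

A caller would create the guard right before touching the archive and check `enforced()` to decide whether a fallback is needed.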

But what still needs to be handled properly is the case where UTF-8 cannot be enforced because the locale is not available on the system in the first place. In that case, we just strip everything except Latin letters and Arabic numerals from the name/path, import the kit into this location, and pop up a dialog that reports the alternate location, mentions the encoding error, and asks the user to set their system locale. Export is less problematic in this case.
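The stripping step for the fallback case could be sketched roughly as follows. The function name `sanitizeToAscii` and the `"drumkit"` fallback for names that end up empty are assumptions for illustration, not taken from the patch. The comparison is done on raw bytes, so every byte of a multi-byte UTF-8 sequence is dropped and the result does not depend on the active locale.

```cpp
#include <string>

// Hypothetical helper (not part of Hydrogen): keeps only ASCII letters and
// digits. Explicit range checks are used instead of std::isalnum so the
// result is independent of the current locale.
std::string sanitizeToAscii( const std::string& sName,
                             const std::string& sFallback = "drumkit" ) {
    std::string sResult;
    sResult.reserve( sName.size() );

    for ( unsigned char c : sName ) {
        const bool bLetter =
            ( c >= 'A' && c <= 'Z' ) || ( c >= 'a' && c <= 'z' );
        const bool bDigit = ( c >= '0' && c <= '9' );
        if ( bLetter || bDigit ) {
            sResult.push_back( static_cast<char>( c ) );
        }
    }

    // A name consisting only of non-ASCII characters would otherwise
    // become empty (assumed fallback, not from the original patch).
    return sResult.empty() ? sFallback : sResult;
}
```

This is meant for a single name component. A full path would have to be split at its separators first, since those are stripped as well.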

Unfortunately, this is not it. libarchive is quite a pain. Our build pipeline (a real lifesaver in this one) caught an export failure on Ubuntu 20.04 - which I was able to reproduce locally. Well, at least a bit. The problem seems to be this one [2], but it only occurs with libarchive installed from Canonical's package repos. When building the same sources with the same options and patches (https://launchpad.net/ubuntu/focal/amd64/libarchive13/3.4.0-2ubuntu1) myself, I was not able to reproduce it.

But the result of this bug is quite severe. Not only is UTF-8 support affected; Hydrogen is no longer able to export drumkits at all.

Since the original issue (https://github.com/hydrogen-music/hydrogen/issues/1957) only affects systems whose locale is set up improperly, and it was only reported to appear on Windows, I will be a little more conservative here. The UTF-8 export patch now only affects Windows systems with libarchive >= 3.5.0. The latter should be guaranteed, as we ship the corresponding DLL ourselves.
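The gating condition can be illustrated with a small sketch. The function names are hypothetical and the actual patch may well use a preprocessor check instead. The only libarchive fact relied on is that its version number is encoded as major * 1000000 + minor * 1000 + revision, so 3.5.0 corresponds to 3005000.

```cpp
// Hypothetical helpers (not part of Hydrogen) illustrating the gate
// "Windows and libarchive >= 3.5.0".

// Mirrors the encoding libarchive uses for ARCHIVE_VERSION_NUMBER.
constexpr int encodeArchiveVersion( int nMajor, int nMinor, int nRevision ) {
    return nMajor * 1000000 + nMinor * 1000 + nRevision;
}

// bIsWindows and nArchiveVersion are passed in explicitly to keep the
// sketch free of platform macros and of the libarchive headers.
constexpr bool useUtf8ExportPatch( bool bIsWindows, int nArchiveVersion ) {
    return bIsWindows &&
        nArchiveVersion >= encodeArchiveVersion( 3, 5, 0 );
}

static_assert( useUtf8ExportPatch( true, 3005000 ),
               "Windows with libarchive 3.5.0 enables the patch" );
static_assert( ! useUtf8ExportPatch( false, 3005000 ),
               "other platforms keep the previous export path" );
```

In a real build the version argument would presumably come from `ARCHIVE_VERSION_NUMBER` in `<archive.h>` at compile time, or from `archive_version_number()` at runtime if the version of the shipped DLL should be checked.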

[1] https://github.com/libarchive/libarchive/issues/2233
[2] https://github.com/libarchive/libarchive/pull/1389/commits/c30f279475e2afd39f380622d2b53b157eb746d8