mixxxdj / mixxx

Mixxx is Free DJ software that gives you everything you need to perform live mixes.
http://mixxx.org

Library scan and file browsing misses file names with non-UTF8 accented characters #6654

Closed by mixxxbot 2 years ago

mixxxbot commented 2 years ago

Reported by: jtmuehlberg Date: 2012-10-02T09:25:45Z Status: Won't Fix Importance: Undecided Launchpad Issue: lp1060082 Attachments: [playlist.m3u and mixxx.log](https://bugs.launchpad.net/bugs/1060082/+attachment/3364381/+files/playlist.m3u and mixxx.log)


This is probably a revival of Bug #⁠911461.

Hardware: Lenovo Thinkpad Edge, AMD Turion(tm) II Neo K625 Dual-Core Processor, 4GB RAM
Audio device: ATI Technologies Inc SBx00 Azalia (Intel HDA) (rev 40)
VGA compatible controller: ATI Technologies Inc M880G [Mobility Radeon HD 4200]
OS: Debian Stable; Kernel: Linux 2.6.32-5-686-bigmem mixxxdj/mixxx#4910 SMP Sun May 6 04:39:05 UTC 2012 i686 GNU/Linux; all file systems are ext3
Mixxx: built from source on 2012-09-27; rev.: 1.11.0-beta2-pre

Problem description:

I'm experiencing a problem similar to the one described in Bug #⁠911461: files and folders with (some) accented characters are not found when scanning the library or when browsing. Apparently those file names contain characters that are invalid in UTF-8 (that's what Nautilus says). I assume they are leftovers from pre-UTF-8 times. However, the file names display correctly in xterm and my music library and my playlists work fine with other players.

Attachments:

- bug.tar
  - playlist.m3u : playlist for one folder from my library; mixxx only displays titles 06, 07, 10, 11, 12, 14, 18. This is regardless of whether I import the playlist or browse the directory
  - mixxx.log : I removed the existing log file, opened mixxx and browsed to the directory containing the files listed in playlist.m3u

When opening the directory in Nautilus I get the following messages on the console:

[...]
(nautilus:4299): Gtk-WARNING **: Failed to set text from markup due to error parsing markup: Error on line 1 char 109: Invalid UTF-8 encoded text in name - not valid '17. Francisco Canaro - Ernesto Fam\xe1 - No Hay Tierra Como La M\xeda.mp3'

(nautilus:4299): Gtk-WARNING **: Failed to set text from markup due to error parsing markup: Error on line 1 char 109: Invalid UTF-8 encoded text in name - not valid '17. Francisco Canaro - Ernesto Fam\xe1 - No Hay Tierra Como La M\xeda.mp3'
[...]

The invalid characters are displayed as (?) in the Nautilus browser. I should mention that I don't actually use Gnome, Nautilus, etc.; I just opened it out of curiosity. The file names display perfectly fine (i.e. "\xe1" is an accented a) in xterm and in the other audio players I'm using.

mixxxbot commented 2 years ago

Commented by: jtmuehlberg Date: 2012-10-02T09:25:45Z Attachments: [playlist.m3u and mixxx.log](https://bugs.launchpad.net/mixxx/+bug/1060082/+attachment/3364381/+files/playlist.m3u and mixxx.log)

mixxxbot commented 2 years ago

Commented by: daschuer Date: 2012-10-09T20:20:57Z


Here is some background information: http://www.nslu2-linux.org/wiki/HowTo/MountFATFileSystems

I can reproduce a similar problem when I mount a FAT16 USB stick with:  
sudo mount -o iocharset=iso8859-2 /dev/sdf1 /mnt  
create a folder with non-ASCII characters, and mount it again with:  
sudo umount /mnt  
sudo mount -o codepage=850 /dev/sdf1 /mnt

mixxxbot commented 2 years ago

Commented by: jtmuehlberg Date: 2012-10-09T20:45:24Z


Just to mention: My file systems are currently all ext3. Yet, it's likely that those files have been on a FAT medium in the past.

mixxxbot commented 2 years ago

Commented by: daschuer Date: 2012-10-10T09:46:35Z


After debugging through the Qt source, things are getting clearer:

Qt stores all strings as 16-bit Unicode (UTF-16). QDir and its siblings convert the UTF-8 strings returned by the Unix readdir to UTF-16. A faulty Latin-1 encoded character is lost during this conversion, so when Qt converts the name back to UTF-8 to open the file, the string is no longer the same and the open fails.

Gtk-based applications do not suffer from this problem because Gtk uses UTF-8 strings internally. No conversion is needed, so the faulty Latin-1 character survives.

Mixxx could only solve this issue by writing native code that bypasses the platform-independent Qt code. To me, that is too much effort for a rare file system fault.

@ Jan Tobias, I think simply renaming the affected files is the best solution. If a lot of files are affected, you can try mounting the partition with the right character encoding and copying them to another UTF-8 partition.
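
The round-trip loss described above can be reproduced outside Qt. What follows is a minimal Python sketch, not Qt's actual code path: it assumes the directory entry is decoded as UTF-8 and that invalid bytes are replaced by U+FFFD, which is an assumption about the behaviour. The helper name `roundtrip` is made up for illustration.

```python
# Raw bytes as stored on disk: Latin-1 encoded "Famá.mp3" (0xE1 is "á" in Latin-1).
raw_name = b"Fam\xe1.mp3"


def roundtrip(name: bytes) -> bytes:
    """Decode bytes as UTF-8 into a Unicode string, then encode back to UTF-8.

    Invalid sequences are replaced with U+FFFD. This replacement policy is an
    assumption made for illustration, not a statement about Qt internals.
    """
    text = name.decode("utf-8", errors="replace")
    return text.encode("utf-8")


restored = roundtrip(raw_name)
print(raw_name)              # b'Fam\xe1.mp3'
print(restored)              # b'Fam\xef\xbf\xbd.mp3'
print(restored == raw_name)  # False: the path no longer names the file on disk
```

A name that is already valid UTF-8 survives the same round trip unchanged, which matches the report that only some of the files go missing.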

mixxxbot commented 2 years ago

Commented by: jtmuehlberg Date: 2012-10-10T10:15:57Z


Oh well... The problem won't be to rename those files but to keep my existing playlists working. There are about 10k files in the library I want to use. Mixxx displays only 5k of those. I guess I'll have to write a script that renames those files and updates my .m3u playlists accordingly. Looks like I won't be trying out Mixxx in the immediate future :-(

mixxxbot commented 2 years ago

Commented by: daschuer Date: 2012-10-10T11:02:07Z


Hi Jan Tobias,

.m3u playlists are Windows-1252 encoded, so they should keep working after a file is renamed to the same name in UTF-8.

So if you rename:  
Francisco Canaro - Ernesto Fam\xe1 - No Hay Tierra Como La M\xeda.mp3  
which is displayed in Nautilus like:  
Francisco Canaro - Ernesto Fam� - No Hay Tierra Como La M�a.mp3  
to  
Francisco Canaro - Ernesto Famá - No Hay Tierra Como La Mía.mp3  
your m3u playlists should still work.

If your files are on a separate partition, you can also give sudo mount -o iocharset=iso8859-1 a try.
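
To illustrate why the playlist entry should still match after the rename, here is a small Python sketch. It assumes the playlist bytes are Windows-1252, as stated above, and that the player decodes both the playlist and the directory entry to Unicode before comparing them. The variable names are hypothetical.

```python
# Bytes as they would appear in a Windows-1252 encoded .m3u playlist.
playlist_entry = b"Ernesto Fam\xe1 - No Hay Tierra Como La M\xeda.mp3"

# Name of the file on disk after it has been renamed to valid UTF-8.
renamed_on_disk = "Ernesto Fam\u00e1 - No Hay Tierra Como La M\u00eda.mp3".encode("utf-8")

# Decoding each side with its own encoding yields the same Unicode text.
same_name = playlist_entry.decode("cp1252") == renamed_on_disk.decode("utf-8")
print(same_name)  # True
```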

Kind regards,

Daniel

mixxxbot commented 2 years ago

Commented by: daschuer Date: 2012-10-10T11:30:37Z


I have just found the command-line tool convmv. It can be installed from the Ubuntu repositories:  
sudo apt-get install convmv

I have successfully renamed the file with:  
convmv -t utf-8 -f latin-1 --notest Francisco\ Canaro\ -\ Ernesto\ Fam�\ -\ No\ Hay\ Tierra\ Como\ La\ M�a.mp3

Add -r to go through directories recursively.

Please backup first!
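
Before renaming anything, it can help to list which names are affected. The following Python sketch is not how convmv is implemented; it only reports names that are not valid UTF-8 together with the name they would get, assuming the stray bytes are Latin-1. The function names `proposed_name` and `find_renames` are made up for illustration, and nothing is renamed.

```python
import os


def proposed_name(name: bytes, source_encoding: str = "latin-1"):
    """Return a UTF-8 replacement for a name that is not valid UTF-8, else None."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the invalid bytes are in source_encoding (Latin-1 here).
        return name.decode(source_encoding).encode("utf-8")
    return None  # already valid UTF-8, leave untouched


def find_renames(root: bytes):
    """Yield (old_path, new_path) byte pairs under root without renaming anything."""
    # Passing bytes makes os.walk return raw byte names, so no byte is lost.
    # topdown=False lists directory contents before the directory itself.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in filenames + dirnames:
            new = proposed_name(name)
            if new is not None:
                yield os.path.join(dirpath, name), os.path.join(dirpath, new)
```

Iterating over `find_renames(b".")` and printing each pair gives a dry-run report for the current directory, which can be checked against a backup before running any real rename.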

mixxxbot commented 2 years ago

Commented by: rryan Date: 2012-10-10T15:01:27Z


Nice job debugging this case Daniel! -- this might be worth filing a bug with Qt to see what they say.

mixxxbot commented 2 years ago

Issue closed with status Won't Fix.