Closed sandersantema closed 3 years ago
What I should've mentioned is that I've run the following command on the database to ensure paths are correct on linux:
sqlite3 ~/.config/beets/library.db "UPDATE items SET path = replace(path, '/Users/sandersantema', '/home/sandersantema');"
Some more info:
* Filename when imported on macOS: `Anthony Parasole at Dekmantel Festival Sa\xcc\x83o Paulo 2017-317327953.mp3`
* Filename when imported on Linux: `Anthony Parasole at Dekmantel Festival S\xc3\xa3o Paulo 2017-317327953.mp3`
As you can see, macOS does indeed seem to prefer the decomposed version while Linux prefers the composed version. When I try to execute `beet show` on macOS using the database with the problematic song imported on Linux, it fails for this newly imported song. So it seems that the ways filenames containing accents and such are stored on macOS and on Linux are not compatible.
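A quick way to check that the two filenames differ only in Unicode normalization is to decode both byte strings and compare their normal forms. The sketch below uses only Python's standard `unicodedata` module; the byte literals are the relevant fragments of the two filenames above, and `describe` is just a helper name made up for this example.

```python
import unicodedata

# Fragments of the two filenames, as raw UTF-8 bytes.
macos_bytes = b"Sa\xcc\x83o Paulo"  # decomposed: "a" followed by U+0303
linux_bytes = b"S\xc3\xa3o Paulo"   # composed: single code point U+00E3

macos_text = macos_bytes.decode("utf-8")
linux_text = linux_bytes.decode("utf-8")


def describe(text):
    """Return the Unicode name of every non-ASCII code point in text."""
    return [unicodedata.name(ch) for ch in text if ord(ch) > 127]


print(describe(macos_text))  # ['COMBINING TILDE']
print(describe(linux_text))  # ['LATIN SMALL LETTER A WITH TILDE']

# The raw strings differ, but they are canonically equivalent:
print(macos_text == linux_text)                                 # False
print(unicodedata.normalize("NFC", macos_text) == linux_text)   # True
print(unicodedata.normalize("NFD", linux_text) == macos_text)   # True
```

So the first name is the NFD (decomposed) spelling and the second is the NFC (composed) spelling of the same text.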
Linux doesn't "prefer" any specific Unicode normalization: on Linux filesystems, paths can be any byte string, and do not even need to be valid Unicode. As far as I know, beets also doesn't alter the Unicode representation at all; it just uses whatever it receives from its metadata sources/the filesystem. So the issue might actually be outside of beets. It would be helpful to know more details about what you did: Which filesystems are involved (HFS+ apparently forces (something similar to) NFD: https://en.wikipedia.org/wiki/HFS_Plus)? Is this the same filesystem mounted on Linux and Mac, or was the music copied in between? With which tool? Might that have changed the filename encoding?
beet list paulo works fine on both systems
This only operates on the database, not the media files.
after importing the file again on linux (I've copied the library from macOS before and plan to synchronize the databases) works fine, although beets doesn't detect it as the same file.
I'm not quite following what you did here and what your goal is. It would be good to know what you're trying to achieve in the end; it might turn out that it can't really be done with beets, though.
Unfortunately, debugging encoding problems can be really, really hard. The thing to know is that, on Linux, your filenames are truly just raw bytes—no encoding is preferred or enforced, as @wisp3rwind mentioned. So beets is attempting to access those files with an exact sequence of bytes. If those don't match, then the file won't be found.
Doing some digging into how you got the "wrong" bytes in your beets database would be useful. In particular, if you imported your files on macOS and then manually modified the database to make it work on Linux, I can see how that would create problems because the two OSes actually use different filenames (i.e., different sequences of bytes that look the same to humans when rendered as Unicode) for the same files.
In the end my goal is to use beets as an interface, or so to speak glue, between my music library and tools on my Linux machine and on my macOS machine. Although I exclusively use the Linux machine for day-to-day use, I'm still stuck with macOS for DJ'ing: my hardware depends on DJ software called Traktor, which in turn depends on iTunes. For now I've hooked up iTunes to beets by way of hooks which trigger AppleScripts. In the end I might do away with iTunes altogether in favor of https://github.com/16pierre/traktorBeetsIntegration, although I'm not ready for that yet because I still sync music to my iPhone using iCloud Music.
The music files themselves are synced by Syncthing. On the Linux machine I use the XFS filesystem; on macOS, APFS. Do you know of any way
In particular, if you imported your files on macOS and then manually modified the database to make it work on Linux, I can see how that would create problems because the two OSes actually use different filenames
This is exactly what I did.
It seems like this issue might be quite hard to resolve and particular to a use case which might be out of scope for beets, so instead I might simply try and rename all the offending filenames given that I don't actually use those for anything and only identify music by metadata. The challenge would then be to come up with a robust way of renaming any file that might cause trouble.
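For the renaming idea, one possible approach on the Linux side is to walk the library and rename every entry whose name is not in NFC form. This is only a sketch under assumptions: `rename_to_nfc` is a hypothetical helper, not part of beets, and it only touches the files. Paths already stored in the beets database would still have to be updated separately.

```python
import os
import unicodedata


def rename_to_nfc(root, dry_run=True):
    """Rename entries under root whose names are not NFC-normalized.

    Returns a list of (old_path, new_path) pairs. With dry_run=True
    nothing on disk is changed.
    """
    changes = []
    # Walk bottom-up so children are handled before their parent directory.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for name in dirnames + filenames:
            target = unicodedata.normalize("NFC", name)
            if target == name:
                continue  # already composed (or plain ASCII)
            old = os.path.join(dirpath, name)
            new = os.path.join(dirpath, target)
            if os.path.lexists(new):
                # Both spellings exist side by side; do not overwrite.
                continue
            changes.append((old, new))
            if not dry_run:
                os.rename(old, new)
    return changes
```

Running it with `dry_run=True` first shows what would be renamed. On a normalization-insensitive filesystem such as APFS, the `lexists` check will most likely find the file under both spellings and skip it, so this sketch is really meant for the Linux machine.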
There might be a possible solution however, although I can't completely assess how good it would be with regard to edge cases and such, and it would probably require quite a lot of work. Since the two problematic byte sequences, `a\xcc\x83` and `\xc3\xa3`, both represent the same character, ã ("a" with a tilde), it might be feasible for beets to consider the Unicode bytes as synonyms. This would however require that this holds for all problematic characters and that there are no ambiguities.
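Treating the two spellings as synonyms could be sketched as comparing paths through a normalized key. The names `path_key` and `same_file_name` are made up for this example and are not beets API. NFC/NFD only cover canonical equivalence, so distinct characters should not be merged; bytes that are not valid UTF-8 are passed through untouched via `surrogateescape`.

```python
import unicodedata


def path_key(path):
    """Map a bytes path to a normalization-independent comparison key."""
    # surrogateescape keeps invalid UTF-8 bytes round-trippable, so two
    # different undecodable byte strings never collapse into one key.
    text = path.decode("utf-8", "surrogateescape")
    normalized = unicodedata.normalize("NFC", text)
    return normalized.encode("utf-8", "surrogateescape")


def same_file_name(a, b):
    """True if two bytes paths differ at most in Unicode normalization."""
    return path_key(a) == path_key(b)


print(same_file_name(b"Sa\xcc\x83o Paulo", b"S\xc3\xa3o Paulo"))  # True
print(same_file_name(b"Sao Paulo", b"S\xc3\xa3o Paulo"))          # False
```

A key like this only helps with comparisons inside the program. Opening the file on Linux still needs the exact bytes that are on disk.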
One option you might consider would be to use ASCII filenames (the `asciify_paths` config option), which are less likely to trigger cross-platform encoding problems.
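To illustrate why ASCII names sidestep the problem, here is a rough standard-library sketch that drops combining marks. It is not how beets implements `asciify_paths` (its transliteration may differ), and `strip_accents` is a hypothetical helper name.

```python
import unicodedata


def strip_accents(name):
    """Drop combining marks so accented Latin letters become plain letters."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


composed = b"S\xc3\xa3o Paulo".decode("utf-8")
decomposed = b"Sa\xcc\x83o Paulo".decode("utf-8")

# Both spellings end up as the same plain ASCII string.
print(strip_accents(composed))    # Sao Paulo
print(strip_accents(decomposed))  # Sao Paulo
```

Characters without a decomposition (for example "ø") pass through unchanged, so this sketch alone does not guarantee pure ASCII output.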
So you're syncing the music only in one direction, namely Linux -> Mac, and these scripts trigger a library re-scan in iTunes? Then, maybe, your scripts would be able to normalize the paths for APFS before trying to access any files?
It seems like this issue might be quite hard to resolve and particular to a use case which might be out of scope for beets, so instead I might simply try and rename all the offending filenames given that I don't actually use those for anything and only identify music by metadata. The challenge would then be to come up with a robust way of renaming any file that might cause trouble.
Or, much more simply, the `asciify`/`asciify_path` options might solve your problem?
`asciify_paths` seems like a great solution! Thanks @sampsyo and @wisp3rwind.
So you're syncing the music only in one direction, namely Linux -> Mac, and these scripts trigger a library re-scan in iTunes? Then, maybe, your scripts would be able to normalize the paths for APFS before trying to access any files?
I think that's what I'd achieve using the `asciify_paths` option, right?
For those interested, this is exactly what I'm doing:
hook:
  hooks:
    - event: after_write
      command: osascript /Users/sandersantema/.config/beets/refresh.scpt "{item.path}"
    - event: write
      command: echo "{item.path}"
    - event: item_removed
      command: osascript /Users/sandersantema/.config/beets/remove.scpt "{item.path}"
    - event: item_removed
      command: mv "{item.path}" /Users/sandersantema/.config/beets/trash
    - event: item_moved
      command: echo "{source}" "{destination}"
    - event: item_moved
      command: osascript /Users/sandersantema/.config/beets/move.scpt "{source}" "{destination}"
Scripts:
move.scpt
remove.scpt
refresh.scpt
One nice thing is that, as you can see here, I don't need an `add.scpt`, because if a song which isn't yet in the iTunes library is refreshed, as is done in `refresh.scpt`, it is added to the library. If you want to use these scripts together with Apple's new Music app, simply replace iTunes with Music in every file.
I think that's what I'd achieve using the `asciify_paths` option, right?
Yes; it might be more drastic (but also simpler and maybe more reliable) than the normalization that you'd minimally need.
I've got a file with the filename `Anthony Parasole at Dekmantel Festival São Paulo 2017-317327953.mp3`; this works perfectly fine on macOS but doesn't on linux. I believe this has something to do with linux expecting composed unicode file encodings while macOS can handle both but prefers decomposed, i.e. `a\xcc\x83`, as can be seen below. I believe `$LANG` and `$LC_ALL` might be relevant to this issue as well; however, these are the same on both my macOS and Linux machines, namely `en_US.UTF-8`, and manually testing using `env LANG=en_US.UTF-8` causes the same error. I think this is a known issue but I'm a bit in over my head as to how to actually resolve it; if I've missed any info please let me know. The best solution to this might be to simply sanitize all filenames on macOS, but I do wonder whether this is a future proof solution given that I'd have to keep on manually sanitizing filenames before importing in the future on Linux.
Some more info I found out later:
* `beet list paulo` works fine on both systems
Problem
Running this command (verbose mode doesn't provide any more relevant info):
While on macOS the command works just fine:
Setup
* 5.12.0-2
* 3.8.9 and 3.9.4 on Linux, works fine using 3.9.5 on macOS
* master branch version