Looks similar...
[root@ct183120 beets]# beet import /var/downloads/"Arcade Fire - Funeral"
Traceback (most recent call last):
File "/usr/bin/beet", line 9, in <module>
load_entry_point('beets==1.0b4', 'console_scripts', 'beet')()
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/__init__.py", line 439, in main
subcommand.func(lib, config, suboptions, subargs)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/commands.py", line 533, in import_func
opts.logpath, art, threaded, color)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/commands.py", line 478, in import_files
pl.run_parallel()
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/pipeline.py", line 94, in run
msg = self.coro.next()
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/commands.py", line 319, in read_albums
for path, items in autotag.albums_in_dir(os.path.expanduser(toppath)):
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/autotag/__init__.py", line 116, in albums_in_dir
for root, dirs, files in _sorted_walk(path):
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/autotag/__init__.py", line 90, in _sorted_walk
base = library._unicode_path(base)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/library.py", line 167, in _unicode_path
return path.decode(sys.getfilesystemencoding())
File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 22-24: invalid data
Original comment by dleink
on 4 Aug 2010 at 2:07
Grr! This sort of thing is supposed to be fixed. Thanks for the report -- any
chance I could get your help diagnosing what's going on? It's not something I
can reproduce here.
If you get a chance, open up a Python shell and run:
>>> import os
>>> os.listdir(u"/var/downloads/Arcade Fire - Funeral")
And then maybe even:
>>> from beets import library
>>> library._unicode_path("/var/downloads/Arcade Fire - Funeral")
And:
>>> os.listdir(library._unicode_path("/var/downloads/Arcade Fire - Funeral"))
If you let me know what these commands output, I may be able to get a better
handle on what's going on here.
For the record, these are the things that seem inconsistent:
* os.listdir() is supposed to give Unicode output when given Unicode input, and
I'm careful to always give it Unicode input. Therefore, the call "base =
library._unicode_path(base)" shouldn't do any encoding.
* I'm decoding a path, which came from the filesystem, using the filesystem
encoding. This should never cause an error -- either the filesystem is lying
about which encoding it uses or it's giving us corrupt filenames.
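To illustrate the second point, here is a minimal sketch (written for Python 3, not the Python 2.6 shown in the traceback) of how decoding with the encoding the filesystem claims can still fail. The Latin-1 filename is an assumed example of a name that does not match a UTF-8 filesystem encoding, and `try_decode` is a hypothetical helper, not part of beets:

```python
# Assumption: the name was written as Latin-1, but the filesystem claims UTF-8.
raw_name = "08 Ha\u00efti.flac".encode("latin-1")  # b'08 Ha\xefti.flac'


def try_decode(name, encoding):
    """Return the decoded name, or None if the bytes are invalid in `encoding`."""
    try:
        return name.decode(encoding)
    except UnicodeDecodeError:
        return None


# 0xef is a UTF-8 lead byte, but it is not followed by continuation bytes here.
as_utf8 = try_decode(raw_name, "utf-8")      # None
as_latin1 = try_decode(raw_name, "latin-1")  # decodes cleanly
```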
Original comment by adrian.sampson
on 4 Aug 2010 at 5:06
Python 2.6.5 (r265:79063, Apr 9 2010, 15:16:58)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.listdir(u"/var/downloads/Arcade Fire - Funeral")
[u'Funeral (.m3u).m3u', '02 Neighborhood #2 (La\xefka).flac', '03 Une Ann\xe9e
Sans Lumi\xe8re.flac', u'Funeral (log).log', u'06 Crown of Love.flac', u'04
Neighborhood #3 (Power Out).flac', u'07 Wake Up.flac', u'09 Rebellion
(Lies).flac', '08 Ha\xefti.flac', u'05 Neighborhood #4 (7 Kettles).flac', u'01
Neighborhood #1 (Tunnels).flac', u'10 In the Backseat.flac', u'Funeral
(.cue).cue']
>>> from beets import library
>>> library._unicode_path("/var/downloads/Arcade Fire - Funeral")
u'/var/downloads/Arcade Fire - Funeral'
>>> os.listdir(library._unicode_path("/var/downloads/Arcade Fire - Funeral")
... )
[u'Funeral (.m3u).m3u', '02 Neighborhood #2 (La\xefka).flac', '03 Une Ann\xe9e
Sans Lumi\xe8re.flac', u'Funeral (log).log', u'06 Crown of Love.flac', u'04
Neighborhood #3 (Power Out).flac', u'07 Wake Up.flac', u'09 Rebellion
(Lies).flac', '08 Ha\xefti.flac', u'05 Neighborhood #4 (7 Kettles).flac', u'01
Neighborhood #1 (Tunnels).flac', u'10 In the Backseat.flac', u'Funeral
(.cue).cue']
>>>
Original comment by dleink
on 4 Aug 2010 at 5:31
Awesome, thanks! That helped a lot.
Unfortunately, the problem of "undecodable paths" has opened up an enormous,
horrible can of worms about filesystem encodings and Unicode. I still need to
figure out exactly how I'm going to address the issue, because handling
malformed filenames internally will add a *lot* of complexity to the core of
beets.
For the time being, though, I've pushed a new revision that simply ignores
filenames that can't be decoded. This is, of course, not a very good solution,
but at least the tagger won't completely crash partway through...
I'll let you know when I have a better answer.
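The stopgap described above (ignoring names that can't be decoded) might look roughly like the following. This is only a sketch in Python 3; the actual revision is not shown in this thread, and `decodable_names` is a hypothetical helper:

```python
def decodable_names(names, encoding="utf-8"):
    """Yield decoded names, skipping any that are invalid in `encoding`."""
    for name in names:
        try:
            yield name.decode(encoding)
        except UnicodeDecodeError:
            # Stopgap: the file is ignored rather than crashing the import.
            continue


names = [b"07 Wake Up.flac", b"08 Ha\xefti.flac"]
kept = list(decodable_names(names))
```

The obvious cost is that skipped files never get imported, which is why this can only be a temporary measure.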
Original comment by adrian.sampson
on 4 Aug 2010 at 7:15
Issue 81 has been merged into this issue.
Original comment by adrian.sampson
on 4 Aug 2010 at 7:16
Okay! I just pushed a few changes that make beets handle paths as opaque
bytestrings (rather than Unicode) end-to-end. With any luck, that should make
this problem just go away!
Sorry for using you as a guinea pig, dleink, but any chance you could check out
the latest version and see if this fixes everything?
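For readers wondering what "opaque bytestrings end-to-end" means in practice, here is a rough Python 3 sketch, not the actual beets change. It relies on `os.listdir` returning bytes when given a bytes path; `list_as_bytes` and `display_name` are hypothetical names:

```python
import os
import tempfile


def list_as_bytes(directory):
    """List a directory via a bytes path, so entries come back as undecoded bytes."""
    return sorted(os.listdir(os.fsencode(directory)))


def display_name(raw):
    """Decode only for display; invalid bytes become U+FFFD instead of raising."""
    return raw.decode("utf-8", "replace")


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "07 Wake Up.flac"), "w").close()
    entries = list_as_bytes(tmp)                # bytes, never decoded
    shown = [display_name(e) for e in entries]  # text, for the UI only
```

The idea is that the bytes returned by the filesystem are passed back to it unchanged, and decoding happens only at the edges where a human needs to read the name.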
Original comment by adrian.sampson
on 5 Aug 2010 at 8:42
Looking good on that Arcade Fire album, let's see how it goes with the rest of
the library...
Original comment by dleink
on 5 Aug 2010 at 9:33
Original comment by adrian.sampson
on 5 Aug 2010 at 10:53
Can't quite figure out which album is causing this...
File "/usr/bin/beet", line 9, in <module>
load_entry_point('beets==1.0b4', 'console_scripts', 'beet')()
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/__init__.py", line 439, in main
subcommand.func(lib, config, suboptions, subargs)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/commands.py", line 552, in import_func
opts.logpath, art, threaded, color)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/commands.py", line 497, in import_files
pl.run_parallel()
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/pipeline.py", line 179, in run
self.coro.send(msg)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/commands.py", line 456, in apply_choices
albuminfo.set_art(artpath)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/library.py", line 1142, in set_art
self.artpath = artdest
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/library.py", line 1048, in __setattr__
self._library.conn.execute(sql, (value, self.id))
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a
text_factory that can interpret 8-bit bytestrings (like text_factory = str). It
is highly recommended that you instead just switch your application to Unicode
strings.
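One way to avoid this class of error is to store raw path bytes as a BLOB rather than as text. The following is a sketch in Python 3, where `sqlite3` binds `bytes` parameters as BLOBs (on Python 2 the value would typically be wrapped with `sqlite3.Binary`). The table and column names are made up for illustration, and this is not necessarily how beets fixed it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the art path is kept as raw bytes, not text.
conn.execute("CREATE TABLE albums (id INTEGER PRIMARY KEY, artpath BLOB)")

art_path = b"/music/Funeral/cover \xef.jpg"  # deliberately not valid UTF-8
conn.execute("INSERT INTO albums (id, artpath) VALUES (?, ?)", (1, art_path))

row = conn.execute("SELECT artpath FROM albums WHERE id = ?", (1,)).fetchone()
stored = row[0]  # comes back as bytes, byte-for-byte identical
```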
Original comment by dleink
on 6 Aug 2010 at 12:51
Another one that looks related..
Traceback (most recent call last):
File "/usr/bin/beet", line 9, in <module>
load_entry_point('beets==1.0b4', 'console_scripts', 'beet')()
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/__init__.py", line 439, in main
subcommand.func(lib, config, suboptions, subargs)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/commands.py", line 552, in import_func
opts.logpath, art, threaded, color)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/commands.py", line 497, in import_files
pl.run_parallel()
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/pipeline.py", line 179, in run
self.coro.send(msg)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/ui/commands.py", line 444, in apply_choices
item.move(lib, True)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/library.py", line 318, in move
dest = library.destination(self)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/library.py", line 834, in destination
subpath = _sanitize_path(subpath)
File "/usr/lib/python2.6/site-packages/beets-1.0b4-py2.6.egg/beets/library.py", line 194, in _sanitize_path
comp = regex.sub(repl, comp)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 10: ordinal not in range(128)
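On Python 2, an error like this usually means a byte string and a Unicode string were mixed in one operation, which triggers an implicit ASCII decode. A sketch of keeping the substitution entirely in bytes follows, written in Python 3. The set of replaced characters is an assumption, not the real `_sanitize_path` rules, and `sanitize_component` is a hypothetical name:

```python
import re

# Hypothetical rule: replace characters that are unsafe in filenames with "_".
UNSAFE = re.compile(rb'[\\/:*?"<>|]')


def sanitize_component(comp):
    """Sanitize one path component given as bytes, without ever decoding it."""
    return UNSAFE.sub(b"_", comp)


# The non-ASCII bytes (0xc3 0xa2) pass through untouched.
clean = sanitize_component("Ch\u00e2teau: Remix?".encode("utf-8"))
```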
Original comment by dleink
on 6 Aug 2010 at 3:10
Thank you! These were both really helpful; I needed to tie up a couple of loose
ends. Commit af9f45480ece should fix the first error (having to do with album
art paths). Commit 567af055c562 should fix the second (the Unicode error in
_sanitize_path). Sorry for the rocky road to stability...
Original comment by adrian.sampson
on 6 Aug 2010 at 5:03
Original comment by adrian.sampson
on 10 Aug 2010 at 5:55
I'm afraid this issue is back with a vengeance in 1.0b6.
{{{switch:torrent daenney$ locale
LANG="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_CTYPE="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_ALL="en_GB.UTF-8"}}}
{{{switch:torrent daenney$ python -c 'import sys; print sys.getfilesystemencoding()'
utf-8}}}
This happens when the following file is encountered:
8 - Fallen Snow (Skatebård Remix).mp3
{{{Traceback (most recent call last):
File "/usr/local/bin/beet", line 9, in <module>
load_entry_point('beets==1.0b6', 'console_scripts', 'beet')()
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/__init__.py", line 457, in main
subcommand.func(lib, config, suboptions, subargs)
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/commands.py", line 617, in import_func
opts.logpath, art, threaded, color, delete, quiet)
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/commands.py", line 559, in import_files
pl.run_parallel()
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/pipeline.py", line 94, in run
msg = self.coro.next()
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/commands.py", line 343, in read_albums
for path, items in autotag.albums_in_dir(os.path.expanduser(toppath)):
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/autotag/__init__.py", line 122, in albums_in_dir
i = library.Item.from_path(os.path.join(root, filename))
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/library.py", line 254, in from_path
i.read(path)
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/library.py", line 319, in read
f = MediaFile(_syspath(read_path))
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/mediafile.py", line 490, in __init__
self.mgfile = mutagen.File(path)
File "/Library/Python/2.6/site-packages/mutagen-1.20-py2.6.egg/mutagen/__init__.py", line 203, in File
fileobj = file(filename, "rb")
IOError: [Errno 2] No such file or directory: '/Volumes/data/unsorted/The Bird
of Music/The Bird of Music Remixes/8 - Fallen Snow (Skateba\xcc\x8ard
Remix).mp3'}}}
Another example:
03 Ultraviolence (Château Marmont Remix) 1.mp3
{{{Traceback (most recent call last):
File "/usr/local/bin/beet", line 9, in <module>
load_entry_point('beets==1.0b6', 'console_scripts', 'beet')()
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/__init__.py", line 457, in main
subcommand.func(lib, config, suboptions, subargs)
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/commands.py", line 617, in import_func
opts.logpath, art, threaded, color, delete, quiet)
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/commands.py", line 559, in import_files
pl.run_parallel()
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/pipeline.py", line 94, in run
msg = self.coro.next()
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/ui/commands.py", line 343, in read_albums
for path, items in autotag.albums_in_dir(os.path.expanduser(toppath)):
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/autotag/__init__.py", line 122, in albums_in_dir
i = library.Item.from_path(os.path.join(root, filename))
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/library.py", line 254, in from_path
i.read(path)
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/library.py", line 319, in read
f = MediaFile(_syspath(read_path))
File "/Library/Python/2.6/site-packages/beets-1.0b6-py2.6.egg/beets/mediafile.py", line 490, in __init__
self.mgfile = mutagen.File(path)
File "/Library/Python/2.6/site-packages/mutagen-1.20-py2.6.egg/mutagen/__init__.py", line 203, in File
fileobj = file(filename, "rb")
IOError: [Errno 2] No such file or directory:
'/Volumes/data/unsorted/HeartsRevolution- Ultraviolence/03 Ultraviolence
(Cha\xcc\x82teau Marmont Remix) 1.mp3'}}}
This looks similar to issue 81.
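A side note on the bytes in those two error messages: `a\xcc\x8a` and `a\xcc\x82` are valid UTF-8, but in decomposed form (a plain "a" followed by a combining mark) rather than the single precomposed character. Whether that mismatch is what makes the file lookup fail here is an assumption; the sketch below (Python 3) only shows that the two forms are the same text but different bytes:

```python
import unicodedata

# Bytes as they appear in the error message: "a" followed by U+030A (combining ring).
decomposed = b"Skateba\xcc\x8ard".decode("utf-8")
composed = unicodedata.normalize("NFC", decomposed)  # single U+00E5 character

# Same text once normalized, but the encoded bytes differ.
same_text = unicodedata.normalize("NFD", composed) == decomposed
same_bytes = composed.encode("utf-8") == decomposed.encode("utf-8")
```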
Original comment by daniele.sluijters
on 31 Jan 2011 at 12:45
And another:
Sexy Sushi - Tu l'as bien mérité
This results in the following beet command:
beet import Sexy\ Sushi\ -\ Tu\ l\'as\ bien\ me\314rite\314\/
Basically, the terminal returns immediately and nothing happens.
Unfortunately, this seems to be a problem that only affects Mac OS X with Samba.
When I cd into that directory on the original filesystem, the command is
completed like this:
cd Sexy\ Sushi\ -\ Tu\ l\'as\ bien\ mérité
The beet import command results in:
beet import Sexy\ Sushi\ -\ Tu\ l\'as\ bien\ mérité
I just did a little more research about UTF-8 and Samba:
Although Mac OS X uses UTF-8 as its encoding method for filenames, it uses an
extended UTF-8 specification that Samba cannot handle, so UTF-8 locale is not
available for Mac OS X.
Basically, this means that UTF-8 filenames apparently cannot be handled over
Samba on Mac OS, which is causing my issues above, since it apparently falls
back to CP850.
I have no idea whether we can check for this during import or fix it somehow,
but I doubt it. It looks like AFP is exempt from this problem.
Original comment by daniele.sluijters
on 31 Jan 2011 at 1:02
If I understand you correctly, this problem is manifesting only when you try to
import files that both (a) contain non-ASCII characters and (b) are located on
a remote SMB share mounted on Mac OS X. Is that correct? If so, this is
definitely a separate issue from the "undecodable filenames" problem, which
occurs when the filenames are non-UTF8 even though they purport to be. We
should open a separate ticket in that case.
It would also be helpful to see the sources you mentioned that talk about UTF-8
support in Mac OS X's Samba -- then I might be able to see if there's a
workaround for this limitation.
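To make the distinction between the two problems concrete, here is a small Python 3 sketch that separates names that are not valid UTF-8 from names that are valid UTF-8 but use combining characters (as the names reported from Mac OS X do). The function name and labels are hypothetical, and treating the second case as a normalization issue is an assumption:

```python
import unicodedata


def classify_name(raw):
    """Classify a filename given as bytes (labels are for illustration only)."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "undecodable"        # purports to be UTF-8 but is not
    if text != unicodedata.normalize("NFC", text):
        return "valid, decomposed"  # valid UTF-8, but not in composed (NFC) form
    return "valid, composed"


first = classify_name(b"08 Ha\xefti.flac")            # the original report
second = classify_name(b"me\xcc\x81rite\xcc\x81")     # "e" + combining acute accent
```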
Original comment by adrian.sampson
on 31 Jan 2011 at 4:51
Original issue reported on code.google.com by
dleink
on 4 Aug 2010 at 2:23