whipper-team / whipper

Python CD-DA ripper preferring accuracy over speed
GNU General Public License v3.0
1.15k stars 91 forks source link

Unable to rip CD with extremely long title #453

Open the-confessor opened 4 years ago

the-confessor commented 4 years ago

I've run into a problem ripping a CD with an extremely long title.

Doing some digging I noticed an existing issue #197 - it says milestone is 1.0 but in whipper 0.9.0 there seems to already be some kind of truncate_filename method introduced. So I am wondering if that fix is in there already, but just not working in this scenario?

The error I encounter:

Track 13 finished, found 21 Q sub-channels with CRC errors
Track 14 finished, found 26 Q sub-channels with CRC errors
Traceback (most recent call last):
  File "/usr/lib64/python3.6/site-packages/whipper/extern/task/task.py", line 518, in c
    callable_task(*args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/whipper/program/cdrdao.py", line 111, in _read
    self._done()
  File "/usr/lib64/python3.6/site-packages/whipper/program/cdrdao.py", line 154, in _done
    os.makedirs(t_dirn, exist_ok=True)
  File "/usr/lib64/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
OSError: [Errno 36] File name too long: "/media/music/Soulwax - Most of the remixes we've made for other people over the years except for the one for Einstürzende Neubauten because we lost it and a few we didn't think sounded good enough or just didn't fit in length-wise, but including some that are hard to find because either people forgot about them or just simply because they haven't been released yet, a few we really love, one we think is just ok, some we did for free, some we did for money, some for ourselves without permission and some for friends as swaps but never on time and always at our studio in Ghent. - Unmixed"
CRITICAL:whipper.command.main:exception OSError at /usr/lib64/python3.6/os.py:220: makedirs(): [Errno 36] File name too long: "/media/music/Soulwax - Most of the remixes we've made for other people over the years except for the one for Einstürzende Neubauten because we lost it and a few we didn't think sounded good enough or just didn't fit in length-wise, but including some that are hard to find because either people forgot about them or just simply because they haven't been released yet, a few we really love, one we think is just ok, some we did for free, some we did for money, some for ourselves without permission and some for friends as swaps but never on time and always at our studio in Ghent. - Unmixed"
Traceback (most recent call last):
  File "/usr/lib64/python3.6/site-packages/whipper/extern/task/task.py", line 518, in c
    callable_task(*args, **kwargs)
  File "/usr/lib64/python3.6/site-packages/whipper/program/cdrdao.py", line 111, in _read
    self._done()
  File "/usr/lib64/python3.6/site-packages/whipper/program/cdrdao.py", line 154, in _done
    os.makedirs(t_dirn, exist_ok=True)
  File "/usr/lib64/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)
OSError: [Errno 36] File name too long: "/media/music/Soulwax - Most of the remixes we've made for other people over the years except for the one for Einstürzende Neubauten because we lost it and a few we didn't think sounded good enough or just didn't fit in length-wise, but including some that are hard to find because either people forgot about them or just simply because they haven't been released yet, a few we really love, one we think is just ok, some we did for free, some we did for money, some for ourselves without permission and some for friends as swaps but never on time and always at our studio in Ghent. - Unmixed"

### looking for pattern: INFO:whipper\.image\.cue:parsing \.cue file ['"].*.cue['"] in /home/ed/logs/audio-rip/rip-2020-01-18T13-10-22.log
### result: success (but couldn't find output folder for further processing)
JoeLametta commented 4 years ago

@the-confessor Hi, I've just commited an untested fix in 151f6c8a4f30182ff7735198ec19ae61b8573c38 (in a custom branch): could you try if that one fixes your issue?

Thanks, Joe

the-confessor commented 4 years ago

I applied that patch and the result is the same - exact same stack trace.

I notice that the patch is in cd.py, which doesn't appear in the stack trace.

the-confessor commented 4 years ago

Update - not only does it not fix the issue, it seems to have other side effects, e.g. here is the output of another CD rip with that patch applied:

MGMT -.cue
MGMT -.m3u
MGMT - Oracular.flac
MGMT - Oracular Spectacular - 06 - 4th Dimensional.flac
MGMT - Oracular Spectacular - 08 - Of Moons, Birds &.flac
MGMT -.score
MGMT -.toc

My templates:

track_template="%A - %d/%A - %d - %t - %n"
disc_template="%A - %d/%A - %d"
JoeLametta commented 4 years ago

Thanks for the reply. I've discovered that the shrinkPath() method we're using is quite useless. A comprehensive way to verify a path doesn't exceed any limit in Python would be this:

import os

def check_path_length(path):
    """
    Check whether the path and its components respect the filesystem limits.

    This method assumes the provided path has been normalized.

    The limits retrieved aren't always reliable as some platforms
    may provide deceptive values.

    NOTE: os.pathconf() is only available on Unix.

    :var path: normalized path
    :type path: str
    :returns: True if length of path and its components respect the filesystem
              limits, False otherwise
    :rtype: bool
    """
    components = path.split(os.sep)
    comp_max_len = len(max(components, key=len).encode())
    fn_lim = pathconf(path.encode(), 'PC_NAME_MAX')
    path_lim = pathconf(path.encode(), 'PC_PATH_MAX')
    return comp_max_len <= fn_lim and len(path.encode()) <= path_lim

I think that the method truncate_filename() works fine but it only applies to the last portion of the path (filename).

What's needed to do to truly address this issue is:

  1. Check that the destination path (without template) respects the limits of the filesystem. If it doesn't, raise a meaningful error.
  2. Implement a function which knowing the basepath (portion before template) and the last path (template + filename) tries to return a shrinked path it in a meaningful way to respect the limits of the filesystem.

For example on Linux it seems that any component of the path can't be longer than 255 bytes so the function should take care of this (shrinking) for any of the components. The limit for a complete path is something like 4096 bytes which, I would say, it's quite difficult to reach for a normal use case.

srussel commented 4 years ago

This is a case that would benefit from #342. Even after truncating to filesystem limits, I don't think I would be happy with a 255 character file or directory name. Some music players may still choke on it and it may be an issue copying to a different filesystem.

the-confessor commented 4 years ago

It is quite a challenging problem, and I agree #342 would at least provide a workaround.

JoeLametta commented 4 years ago

In addition to the filesystem limits check we could add a feature to provide users a way to set more restrictive limits which would allow shrinking the paths even more.