whipper-team / whipper

Python CD-DA ripper preferring accuracy over speed
GNU General Public License v3.0
1.11k stars 88 forks source link

Upgrade to python 3.10.10 breaks rip #591

Closed rndmc12 closed 11 months ago

rndmc12 commented 1 year ago

Since my distro (Arch) recently shipped the stable python 3.10.10, rips fail.

Due to the critical exception below, I suspect the error may be related to PEP-624: https://peps.python.org/pep-0624/

Any advice on how to proceed?

Example of fail:


INFO:whipper.command.cd:checking device /dev/sr0
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/whipper/extern/task/task.py", line 523, in c
    callable_task(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/whipper/program/cdrdao.py", line 115, in _read
    self._done()
  File "/usr/lib/python3.10/site-packages/whipper/program/cdrdao.py", line 153, in _done
    self.toc.parse()
  File "/usr/lib/python3.10/site-packages/whipper/image/toc.py", line 203, in parse
    content = f.readlines()
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 121: invalid start byte
CRITICAL:whipper.command.main:exception UnicodeDecodeError at /usr/lib/python3.10/codecs.py:322: decode(): 'utf-8' codec can't decode byte 0x83 in position 121: invalid start byte
Traceback (most recent call last):
  File "/usr/lib/python3.10/site-packages/whipper/extern/task/task.py", line 523, in c
    callable_task(*args, **kwargs)
  File "/usr/lib/python3.10/site-packages/whipper/program/cdrdao.py", line 115, in _read
    self._done()
  File "/usr/lib/python3.10/site-packages/whipper/program/cdrdao.py", line 153, in _done
    self.toc.parse()
  File "/usr/lib/python3.10/site-packages/whipper/image/toc.py", line 203, in parse
    content = f.readlines()
  File "/usr/lib/python3.10/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x83 in position 121: invalid start ```
github-actions[bot] commented 1 year ago

👋 Thanks for opening your first issue here! If you're reporting a 🐞 bug, please make sure you include steps to reproduce it. We get a lot of issues on this repo, so please be patient and we will get back to you as soon as we can.

To help make it easier for us to investigate your issue, please follow the contributing instructions.

w0d4 commented 1 year ago

I can approve that. Upgraded to python 3.10.10 a few minutes ago and whipper is not working anymore.

MerlijnWajer commented 1 year ago

I think that changing whipper/image.toc.py to open the file with different encoding will probably help. Instead of:

        with open(self._path) as f:
            content = f.readlines()

maybe try:

        with open(self._path, 'rb') as f:
            content = f.readlines()
MerlijnWajer commented 1 year ago

(This is on line 202 - 203)

elomatreb commented 1 year ago

I was getting this too, it's caused by an uninitialized memory access in cdrdao: https://github.com/cdrdao/cdrdao/pull/21

rndmc12 commented 1 year ago

Thanks for the feedback. With Arch cdrdao was pushed to 1.2.5 recently and whipper works again after downgrading cdrdao to the previous 1.2.4. Same applies in a fedora machine, which ships python 3.10.10 and cdrdao 1.2.4. Looks like a good catch to me.

MerlijnWajer commented 1 year ago

Are you saying that cdrdao breaks only with a recent Python? It seemed to me like we have two separate issues:

  1. whipper assumes the cdrdao file contains valid utf-8
  2. cdrdao broke in some cases
MerlijnWajer commented 1 year ago

But if all the users on this issue use the same linux distro, it sounds like the only breaking change could indeed be cdrdao... :)

elomatreb commented 1 year ago

I don't think this is related to Python specifically. Looking at https://docs.python.org/3.10/whatsnew/changelog.html#python-3-10-10-final, I don't see any relevant changes.

The bug in cdrdao is an undefined behavior bug, which means that the previous build in Arch may just have gotten lucky and not exhibited the problem. It just happened to break the UTF-8 assumption, but I also had runs where it named the output files as some arbitrary shared object file linked into the binary or just segfaulted directly.

rndmc12 commented 1 year ago

Well, I've used whipper a few times after cdrdao got rolled to 1.2.5 without any issues, python was upgraded a month after cdrdao only. So, it is peculiar indeed. That's why I suspected the linked PEP, which is mentioned for 3.10.10, but that was more blind fishing than anything else :) I stick to the downgrade for now and watch for @elomatreb 's merge request; cdrdao got a --no utf-8 option in the last release after all.

boustrophedon commented 1 year ago

I'm also on arch and ran into this issue. Downgrading to cdrdao 1.2.4 works for me.

Inky1003 commented 1 year ago

For those using Arch and too lazy to search for the package:

pacman -U https://archive.archlinux.org/packages/c/cdrdao/cdrdao-1.2.4-2-x86_64.pkg.tar.zst

PS: The link MAY be dead in the future. Be sure to have a local copy of It if that happens.

rndmc12 commented 11 months ago

The Arch packager has thankfully pulled @elomatreb 's patch and shipped a patched cdrdao 1.2.5-2 I have just done one rip with the package and it worked again first try.

rndmc12 commented 11 months ago

I have ripped about a dozen CDs since the update to 1.2.5-2 on Arch and have not noticed issues. Hence, I close this issue as fixed by above linked cdrdao patch. Thanks.

Be-ing commented 9 months ago

dnf downgrade cdrdao works around the bug on Fedora 38.