clalancette / pycdlib

Python library to read and write ISOs
GNU Lesser General Public License v2.1

os.walk fails on Shift-JIS encoded ISO-9660 filesystem: Part II - Revenge of the Folders #109

Open einstein95 opened 1 year ago

einstein95 commented 1 year ago

Exact same problem as last time (see #101), except this time it barfs on directory names that are Shift-JIS encoded (because of course those are as well).

My Python code:

import pycdlib

iso = pycdlib.PyCdlib()
iso.open('Gorippa Petit 19.iso')

for f in iso.walk(iso_path='/', encoding='shift_jis'):
    print(f)

The output and error:

('/', ['JPEG', 'EPS', 'DATA'], ['検索ブラウザ.exe', 'お読みください.txt'])
('/DATA', [], ['macromedia.dxr', 'MENU.DXR', 'KUMI.DXR', 'BUHIN.DXR'])
('/EPS', ['部品', '組合せ'], [])
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    for f in iso.walk(iso_path='/', encoding='shift_jis'):
  File "/lib/python3.10/site-packages/pycdlib/pycdlib.py", line 5916, in walk
    relpath = self.full_path_from_dirrecord(dir_record,
  File "/lib/python3.10/site-packages/pycdlib/pycdlib.py", line 5692, in full_path_from_dirrecord
    names.insert(0, name.decode(encoding))
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x91 in position 0: invalid start byte

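The folder name in question is b'\x91g\x8d\x87\x82\xb9', or 組合せ once correctly decoded as Shift-JIS. Note that the error mentions the 'utf-8' codec even though encoding='shift_jis' was passed to walk().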
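
To confirm the bytes themselves are fine, here is a minimal standalone check that does not touch pycdlib at all. The helper name try_decode is made up for illustration. It only shows that the reported bytes are invalid as UTF-8 but valid as Shift-JIS, which would be consistent with the encoding argument not reaching the decode call in full_path_from_dirrecord (that part is an inference from the traceback, not something verified against the library source).

```python
# Raw directory identifier, copied from the report above.
raw_name = b'\x91g\x8d\x87\x82\xb9'


def try_decode(data: bytes, encoding: str):
    """Return the decoded string, or None if the bytes are invalid in that encoding.

    Hypothetical helper for this check only; not part of pycdlib.
    """
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# 0x91 is a UTF-8 continuation byte, so it cannot start a character.
utf8_result = try_decode(raw_name, 'utf-8')

# The same bytes are three valid Shift-JIS characters.
sjis_result = try_decode(raw_name, 'shift_jis')

print('utf-8    :', utf8_result)
print('shift_jis:', sjis_result)
```

If the UTF-8 attempt yields None and the Shift-JIS attempt yields the folder name, the ISO contents are not the problem and the decode step is simply using the wrong codec for directory names.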