dcantrell / pyparted

Python bindings for GNU parted (libparted)
GNU General Public License v2.0

UnicodeDecodeError exception due to invalid 0x92 ASCII character in disk device name #105

Closed by okh-mzny 9 months ago

okh-mzny commented 9 months ago

This follows on from the archinstall issue https://github.com/archlinux/archinstall/issues/2111.

Hey there, I have tried to use the archinstall script on a Dell Wyse 3040 thin client. Archinstall uses pyparted under the hood.

Some info on the platform: https://www.parkytowers.me.uk/thin/wyse/3040/

The most important point is that the system includes 8/16GB of soldered eMMC flash. Its name is "H8G4a" but the MMC subsystem under Linux returns the following:

root@archiso ~ # cat /sys/class/block/mmcblk0/device/name
H8G4a▒
root@archiso ~ # cat /sys/class/block/mmcblk0/device/name | hexdump -C
00000000  48 38 47 34 61 92 0a                              |H8G4a..|

This odd name causes pyparted to throw an exception, as it's not expecting the 0x92 byte:

Traceback (most recent call last):
  File "/usr/bin/archinstall", line 5, in <module>
    from archinstall import run_as_a_module
  File "/usr/lib/python3.11/site-packages/archinstall/__init__.py", line 11, in <module>
    from .lib import disk
  File "/usr/lib/python3.11/site-packages/archinstall/lib/disk/__init__.py", line 1, in <module>
    from .device_handler import device_handler, disk_layouts
  File "/usr/lib/python3.11/site-packages/archinstall/lib/disk/device_handler.py", line 664, in <module>
    device_handler = DeviceHandler()
                     ^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/archinstall/lib/disk/device_handler.py", line 39, in __init__
    self.load_devices()
  File "/usr/lib/python3.11/site-packages/archinstall/lib/disk/device_handler.py", line 71, in load_devices
    device_info = _DeviceInfo.from_disk(disk)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/archinstall/lib/disk/device_model.py", line 446, in from_disk
    model=device.model.strip(),
          ^^^^^^^^^^^^
  File "/usr/lib/python3.11/site-packages/parted/device.py", line 69, in model
    print([self.__device, self.__device.model])
                          ^^^^^^^^^^^^^^^^^^^
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x92 in position 9: invalid start byte
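
The decode failure can be reproduced outside pyparted with a minimal sketch that uses only the bytes from the hexdump above. It assumes nothing about how libparted builds the model string, so the reported position differs from the traceback (5 here, 9 there).

```python
# Bytes copied from the hexdump above: "H8G4a", then 0x92, then a newline.
raw = bytes.fromhex("48 38 47 34 61 92 0a")

# 0x92 is 0b10010010. The 10xxxxxx pattern marks a UTF-8 continuation byte,
# which is only valid after a lead byte, so strict decoding rejects it.
is_continuation = (raw[5] & 0b1100_0000) == 0b1000_0000

try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(is_continuation)
print(error)
```

The exception carries the same "invalid start byte" reason as the one in the traceback.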

The eMMC subsystem gets its name from the so-called "CID" or Card IDentification register as defined in the SD/MMC standard. The name is defined as a 5-byte ASCII string, so mine is supposed to be H8G4a, but the reported name is longer. For some reason the eMMC subsystem under Linux returns more data and includes the 0x92 and 0x0a bytes that cause the issue. I can find 0x92 in my CID, but I'm not sure where 0x0a is coming from.

Is there something we could / should do in pyparted to fix these invalid character errors? Make the script ignore or remove invalid characters?
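
As a rough illustration of the two options in the question, here is a sketch of lossy decoding in plain Python. The helper names `decode_lossy` and `printable_ascii` are hypothetical and are not part of pyparted or archinstall. This is not how pyparted handles the problem, only what "ignore or remove invalid characters" could look like on the raw bytes.

```python
def decode_lossy(raw: bytes) -> str:
    """Decode as UTF-8, replacing undecodable bytes with U+FFFD."""
    return raw.decode("utf-8", errors="replace").strip()


def printable_ascii(raw: bytes) -> str:
    """Keep only printable ASCII bytes (0x20..0x7E) and drop the rest."""
    return "".join(chr(b) for b in raw if 0x20 <= b < 0x7F)


# Bytes taken from the hexdump in this report.
name = b"H8G4a\x92\n"

print(repr(decode_lossy(name)))     # invalid byte kept as a replacement marker
print(repr(printable_ascii(name)))  # invalid byte and newline removed
```

Replacing keeps a visible marker that something was wrong with the name. Filtering gives a cleaner string but silently hides the bad byte.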

I'm looking forward to input on this.

Cheers.

okh-mzny commented 9 months ago

I noticed this issue had already been reported, whoops: https://github.com/dcantrell/pyparted/issues/76

Closing..