theopolis / uefi-firmware-parser

Parse BIOS/Intel ME/UEFI firmware related structures: Volumes, FileSystems, Files, etc

`UnicodeDecodeError` when trying to extract UEFI #117

Closed jstucke closed 1 year ago

jstucke commented 1 year ago

I'm getting a UnicodeDecodeError when trying to extract files from a Dell UEFI file with --superbrute:

Traceback (most recent call last):
  File "/usr/bin/uefi-firmware-parser", line 171, in <module>
    superbrute_search(input_data)
  File "/usr/bin/uefi-firmware-parser", line 42, in superbrute_search
    _process_show_extract(parser.parse())
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/__init__.py", line 81, in parse
    if not self.firmware.process():
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/pfs.py", line 394, in process
    section.process()
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/pfs.py", line 283, in process
    raw.process()
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/base.py", line 186, in process
    self.object = parser.parse()
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/__init__.py", line 81, in parse
    if not self.firmware.process():
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/me.py", line 663, in process
    if entry.process():
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/me.py", line 618, in process
    if manifest.process():
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/me.py", line 451, in process
    if not self._parse_mods():
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/me.py", line 373, in _parse_mods
    module = MeModule(
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/me.py", line 102, in __init__
    self.name = self.structure.Name
  File "/usr/local/lib/python3.10/site-packages/uefi_firmware/base.py", line 30, in name
    name = name.decode("utf-8")
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xec in position 0: invalid continuation byte

fix: either handle the error or use name.decode("utf-8", errors="replace") or name.decode("utf-8", errors="ignore"). Some of the resulting files look kinda garbled, though (e.g. pfsobject/section-558297e8-2efe-4faa-ba83-5465d6de3099/partitions/FTPR/�|E�+,]��u���O.module) but most are fine (so unpacking the file seems to work in general). Could it be a different underlying problem with encodings or offsets?
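For reference, a minimal sketch that reproduces the decoding failure outside the parser and shows what the two `errors=` modes do. It assumes the raw name bytes are the ones given by the hex value quoted later in this thread; the variable names are made up for illustration and are not part of uefi_firmware.

```python
# Raw module name bytes (assumption: taken from the hex value quoted in this thread).
raw_name = bytes.fromhex("ec7c45eb9b192b2c5daff575f9ceda4f")

# Strict decoding fails on the very first byte, as in the traceback above.
try:
    raw_name.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as error:
    strict_error = error

# errors="ignore" silently drops undecodable bytes (data is lost).
ignored = raw_name.decode("utf-8", errors="ignore")

# errors="replace" substitutes U+FFFD for each undecodable sequence (data is also lost).
replaced = raw_name.decode("utf-8", errors="replace")

print(strict_error)
print(repr(ignored))
print(repr(replaced))
```

Both modes avoid the exception, but neither is reversible: the original bytes cannot be recovered from the resulting file name.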

This happens, for example, with the Dell Latitude 3160 UEFI image contained in 3160A08.exe from the Dell support site.

theopolis commented 1 year ago

Thanks @jstucke, this is a great find. If you have time to submit a pull request then I am happy to review and we'll get it merged.

jstucke commented 1 year ago

Sure, I can submit a pull request. What do you think would be the best solution? Ignoring the characters that cannot be decoded or replacing them? Using the hex value could also be a good alternative. I would prefer the latter, because then no data would get lost during decoding.

example output for the offending file that caused the error:

| errors=ignore | errors=replace | hex() |
| --- | --- | --- |
| '\|E\x19+,]uO' | '�\|E�\x19+,]��u���O' | 'ec7c45eb9b192b2c5daff575f9ceda4f' |

theopolis commented 1 year ago

I agree. I wonder if an indicator like 0x[ec7c45eb9b192b2c5daff575f9ceda4f] could be used (where you also add the 0x[ prefix and ] suffix)? This way you could differentiate between someone using the static value ec7c45eb9b192b2c5daff575f9ceda4f and the tool auto-encoding.
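A minimal sketch of that idea, for illustration only: try UTF-8 first and fall back to the bracketed hex form when decoding fails. The helper name `decode_name` is hypothetical, and this is not necessarily how the eventual pull request implements it.

```python
def decode_name(raw: bytes) -> str:
    """Hypothetical helper: decode a name, falling back to a marked hex string."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # The 0x[ ... ] wrapper marks the name as auto-encoded by the tool,
        # so it cannot be confused with a name that is literally a hex string.
        return "0x[{}]".format(raw.hex())


print(decode_name(b"FTPR"))
print(decode_name(bytes.fromhex("ec7c45eb9b192b2c5daff575f9ceda4f")))
```

Unlike `errors="ignore"` or `errors="replace"`, the fallback keeps every byte of the original name, and the result contains only ASCII characters that are safe in file paths.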

jstucke commented 1 year ago

I opened a PR but I'm a bit puzzled regarding Python version compatibility. Is Python 2.7 still supported?

theopolis commented 1 year ago

> Is Python 2.7 still supported?

I was leaving it in insofar as it wasn't a pain to support. So it's safe to say no.