eliben / pyelftools

Parsing ELF and DWARF in Python

Unable to handle notes following a NT_GNU_PROPERTY_TYPE_0 note in the same segment #534

Closed haadr closed 10 months ago

haadr commented 10 months ago

Hia!

Everything below was tested with pyelftools version 0.30.

NOTE: To reproduce the behavior below, I first had to address https://github.com/eliben/pyelftools/issues/535.

Context

I've hit an issue where Qt 6 has started adding note entries to ELF segments and sections to mark an ELF file as a Qt plugin.

The new note, .note.qt.metadata, is referenced both as part of a segment and as its own section. It seems to be appended to a segment that already contains an NT_GNU_PROPERTY_TYPE_0 note.

Example:

# readelf --segments libqtwebenginequickplugin.so
Elf file type is DYN (Shared object file)
Entry point 0x0
There are 11 program headers, starting at offset 64

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000001658 0x0000000000001658  R      0x1000
  LOAD           0x0000000000002000 0x0000000000002000 0x0000000000002000
                 0x00000000000006f5 0x00000000000006f5  R E    0x1000
  LOAD           0x0000000000003000 0x0000000000003000 0x0000000000003000
                 0x0000000000000398 0x0000000000000398  R      0x1000
  LOAD           0x0000000000003b70 0x0000000000004b70 0x0000000000004b70
                 0x0000000000000530 0x0000000000000558  RW     0x1000
  DYNAMIC        0x0000000000003cb0 0x0000000000004cb0 0x0000000000004cb0
                 0x0000000000000260 0x0000000000000260  RW     0x8
  NOTE           0x00000000000002a8 0x00000000000002a8 0x00000000000002a8
                 0x00000000000000a8 0x00000000000000a8  R      0x8
  NOTE           0x0000000000000350 0x0000000000000350 0x0000000000000350
                 0x0000000000000024 0x0000000000000024  R      0x4
  GNU_PROPERTY   0x00000000000002a8 0x00000000000002a8 0x00000000000002a8
                 0x0000000000000030 0x0000000000000030  R      0x8
  GNU_EH_FRAME   0x000000000000309c 0x000000000000309c 0x000000000000309c
                 0x00000000000000ac 0x00000000000000ac  R      0x4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     0x10
  GNU_RELRO      0x0000000000003b70 0x0000000000004b70 0x0000000000004b70
                 0x0000000000000490 0x0000000000000490  R      0x1

 Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.property .note.qt.metadata .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_d .gnu.version_r .rela.dyn .rela.plt 
   01     .init .plt .plt.got .text .fini 
   02     .rodata .eh_frame_hdr .eh_frame 
   03     .init_array .fini_array .data.rel.ro .dynamic .got .data .qtversion .bss 
   04     .dynamic 
   05     .note.gnu.property .note.qt.metadata 
   06     .note.gnu.build-id 
   07     .note.gnu.property 
   08     .eh_frame_hdr 
   09     
   10     .init_array .fini_array .data.rel.ro .dynamic .got 
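
A quick arithmetic check of the program headers above shows how the two notes share one segment. This is only a sketch: the offsets and sizes are copied from the readelf output, and the conclusion that the leftover bytes belong to .note.qt.metadata is an inference from the section-to-segment mapping, not something readelf states directly.

```python
# Offsets and sizes copied from the readelf program headers above.
note_off, note_size = 0x2a8, 0xa8   # first NOTE segment (segment 05)
prop_off, prop_size = 0x2a8, 0x30   # GNU_PROPERTY segment (segment 07)

prop_end = prop_off + prop_size     # end of the GNU property note
note_end = note_off + note_size     # end of the enclosing NOTE segment
leftover = note_end - prop_end      # bytes in the NOTE segment after the property note

print(hex(prop_end), hex(note_end), hex(leftover))
```

The property note ends at 0x2d8, while the NOTE segment that starts at the same offset runs on to 0x350, so 0x78 bytes of other note data follow the property note inside that segment.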

Bug?

When iterating over all notes (looping over the segments, then over each segment's notes), iter_notes crashes like so:

Traceback (most recent call last):
  File "pyelftools/elftools/common/utils.py", line 43, in struct_parse
    return struct.parse_stream(stream)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyelftools/elftools/construct/core.py", line 190, in parse_stream
    return self._parse(stream, Container())
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyelftools/elftools/construct/core.py", line 647, in _parse
    subobj = sc._parse(stream, context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyelftools/elftools/construct/core.py", line 825, in _parse
    obj = self.cases.get(key, self.default)._parse(stream, context)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyelftools/elftools/construct/core.py", line 385, in _parse
    return _read_stream(stream, self.lengthfunc(context))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyelftools/elftools/construct/core.py", line 293, in _read_stream
    raise FieldError("expected %d, found %d" % (length, len(data)))
elftools.construct.core.FieldError: expected 1701734759, found 18032
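
The "expected" length in the error is itself a hint. As a small check, assuming the file is little-endian (typical for x86-64, but not confirmed by the output above), the value can be re-encoded as the four bytes it was read from:

```python
import struct

bogus_length = 1701734759  # the "expected" value from the FieldError above

# Re-encode the value as the 4 little-endian bytes it was parsed from.
raw = struct.pack('<I', bogus_length)
print(hex(bogus_length), raw)
```

The four bytes are printable ASCII (b'gine'), which suggests the parser read a length field out of string data rather than out of a real property header. Which string they come from is a guess without inspecting the file.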

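Without being an ELF expert, it seems to me that the reason is that the NT_GNU_PROPERTY_TYPE_0 branch of iter_notes keeps parsing properties until the end of the whole segment (end) instead of stopping at the end of the current note's descriptor, so the bytes of the following note are misread as properties.

Fix?

A simple fix seems to be something like: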

diff --git a/elftools/elf/notes.py b/elftools/elf/notes.py
index c708cbf..bec95ca 100644
--- a/elftools/elf/notes.py
+++ b/elftools/elf/notes.py
@@ -49,7 +49,8 @@ def iter_notes(elffile, offset, size):
         elif note['n_type'] == 'NT_GNU_PROPERTY_TYPE_0':
             off = offset
             props = []
-            while off < end:
+            current_note_end = offset + note['n_descsz']
+            while off < current_note_end:
                 p = struct_parse(elffile.structs.Elf_Prop, elffile.stream, off)
                 off += roundup(p.pr_datasz + 8, 2 if elffile.elfclass == 32 else 3)
                 props.append(p)
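
To illustrate why the bound matters, here is a minimal, self-contained sketch that does not use pyelftools at all. It builds a synthetic little-endian note segment in memory (a GNU property note followed by a second note) and parses it with the property loop bounded by n_descsz. The helper names (make_note, parse_properties, iter_notes_sketch), the "demo" note, and its type value are hypothetical; only the note header layout and NT_GNU_PROPERTY_TYPE_0 = 5 are taken from the ELF note format.

```python
import struct

NT_GNU_PROPERTY_TYPE_0 = 5

def roundup(n, bits):
    # Round n up to the next multiple of 2**bits.
    return (n - 1 + (1 << bits)) >> bits << bits

def make_note(name, n_type, desc):
    # Hypothetical helper: header (namesz, descsz, type), then the
    # NUL-terminated name and the descriptor, each padded to 4 bytes.
    name_b = name + b'\0'
    hdr = struct.pack('<III', len(name_b), len(desc), n_type)
    return (hdr
            + name_b.ljust(roundup(len(name_b), 2), b'\0')
            + desc.ljust(roundup(len(desc), 2), b'\0'))

def parse_properties(buf, desc_start, desc_size, elfclass=64):
    # Stop at the end of THIS note's descriptor, not the end of the segment.
    desc_end = desc_start + desc_size
    align_bits = 2 if elfclass == 32 else 3
    props, off = [], desc_start
    while off < desc_end:
        pr_type, pr_datasz = struct.unpack_from('<II', buf, off)
        props.append((pr_type, buf[off + 8:off + 8 + pr_datasz]))
        off += roundup(pr_datasz + 8, align_bits)
    return props

def iter_notes_sketch(buf, elfclass=64):
    offset, end = 0, len(buf)
    while offset < end:
        namesz, descsz, n_type = struct.unpack_from('<III', buf, offset)
        offset += 12
        name = buf[offset:offset + namesz].rstrip(b'\0')
        offset += roundup(namesz, 2)
        if n_type == NT_GNU_PROPERTY_TYPE_0:
            desc = parse_properties(buf, offset, descsz, elfclass)
        else:
            desc = buf[offset:offset + descsz]
        offset += roundup(descsz, 2)
        yield name, n_type, desc

# One property (type 0xc0000002, 4 data bytes, padded to 8-byte alignment),
# followed by a made-up second note whose payload is plain text.
prop = struct.pack('<II', 0xc0000002, 4) + struct.pack('<I', 3) + b'\0' * 4
segment = make_note(b'GNU', NT_GNU_PROPERTY_TYPE_0, prop) \
    + make_note(b'demo', 1, b'gine-payload')

notes = list(iter_notes_sketch(segment))
for note in notes:
    print(note)
```

With the loop bounded by the descriptor size, both notes come out intact. If desc_end were the end of the segment instead, the loop would go on to interpret the second note's header and payload as property headers, which is the failure mode described in this issue.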

Reproduction

Script used to reproduce the error (based on the elf_notes.py example):

from __future__ import print_function
import sys

# If pyelftools is not installed, the example can also run from the root or
# examples/ dir of the source distribution.
sys.path[0:0] = ['.', '..']

from elftools.elf.elffile import ELFFile
from elftools.elf.segments import NoteSegment

def process_file(filename):
    print('Processing file:', filename)
    with open(filename, 'rb') as f:
        for segment in ELFFile(f).iter_segments():
            if not isinstance(segment, NoteSegment):
                continue
            for note in segment.iter_notes():
                print('    Name:', note['n_name'])
                print('    Type:', note['n_type'])
                desc = note['n_desc']
                if note['n_type'] == 'NT_GNU_ABI_TAG':
                    print('    Desc: %s, ABI: %d.%d.%d' % (
                        desc['abi_os'],
                        desc['abi_major'],
                        desc['abi_minor'],
                        desc['abi_tiny']))
                elif note['n_type'] in {'NT_GNU_BUILD_ID', 'NT_GNU_GOLD_VERSION', 'NT_GNU_PROPERTY_TYPE_0'}:
                    print('    Desc:', desc)
                else:
                    print('    Desc:', "".join(map(chr, desc)))

if __name__ == '__main__':
    if sys.argv[1] == '--test':
        for filename in sys.argv[2:]:
            process_file(filename)

The readelf output for the shared object file that triggers this is attached.

An example shared object file that triggers this is also attached. It is libqtwebenginequickplugin.so from a standard openSUSE Tumbleweed install.

readelf-a.txt

libqtwebenginequickplugin.so.txt

martijnthe commented 10 months ago

I've put up a fix PR here: https://github.com/eliben/pyelftools/pull/538

cc @eliben