Performance enhancements

sevaa commented 4 months ago

I made the following enhancements, in the rough order of effect:

Rewritten the LEB128 parsers to be less idiomatic but more streamlined
Instantiated the handful of parsers for scalars that are used in all the one off struct_parse calls, and replaced all points of usage
Inlined some stuff on the hot path
Removed unneeded stream_preserve for streams in auxiliary sections

sevaa commented 4 months ago

@eliben

On the topic of performance, there is a disconnect between ELF proper parsing and DWARF parsing. ELF parsing, at least in the typical use case, happens against a file stream. DWARF parsing is all in memory (barring exotic scenarios where users monkeypatch or subclass ELFFile), and yet we still treat DWARF sections as streams instead of byte arrays in memory that they really are. Rather than read()ing (and constructing bytes objects) all the time, the parser could have worked with bytes elements and slices. The workhorse method of parsing primitive datatypes - struct..unpack() - can take a buffer and an offset.

But that would pretty much mean abandoning construct for the DWARF part of it. As far as I understand, construct was meant to work like protobuf, working with wire and file streams.

eliben commented 4 months ago

Yes, I'd rather keep this capability. A more interesting direction would be to enable more incremental DWARF parsing without slurping whole sections into memory

sevaa commented 4 months ago

Reliance on construct is not a capability per se, it's more of an implementation detail. The DWARF parser mostly spits out Python native structures - lists, namedtuples and such. Off the top of my head, the only Construct objects that we get in the public API are CU headers. And the interface of Construct, one there x["a"] and x.a are equivalent, is hardly magical.

OBTW, construct technically has a parse-from-buffer method. Only it works by constructing a BytesIO around it, then parses the stream :) Overhead wise, not a net win.

Anyway, were I to implement buffer style parsing, I won't be getting rid of construct altogether - compound datatypes can stay. I'd implement a buffer+position object that walks like a stream but doesn't quite quack like a stream, and I'd teach the primitive type parsers (there is just a handful) to recognize those. The compound parsers - Struct, Array - can stay as they are.

I gave some thought to incremental parsing of DWARF. One big obstacle to that would be the transforms that DWARF sections undergo - two decompression hooks, and relocations (also the phantom bytes thing, but that's an edge case by far). Compressed streams are not seekable. Also, I'm assuming we are talking about support for extra large binaries here; were were to implement a no-slurp mode, I'm afraid it'd have to be accompanied by a no-cache mode. At least no CU/DIE cache - lack of an abbrev cache, I'm afraid, would be to costly.

But let's assume there are no transforms. I had three possible designs in mind:

Implement a "file window" stream class, one that wraps a proper file stream, maintains its own position/length and seeks as needed
Open the file several times so that there is a file stream per section (dup won't help, dupped handles share their current position)
Memory map the file, construct ByteIO objects around sections and trust the OS to do the right thing

Which one do you think sounds the least crazy?

eliben commented 4 months ago

The mmap idea sounds intriguing

sevaa commented 4 months ago

Basically already done this in #481 :)

eliben / pyelftools

Performance enhancements #557