eliben / pyelftools

Parsing ELF and DWARF in Python
Other
1.99k stars 507 forks source link

Migrate to Construct 2.10 take 2 #548

Open sevaa opened 5 months ago

sevaa commented 5 months ago

Move away from an internal copy of Construct to the latest version in order to support DWARF v5 required data types.

Supersedes #477. Performance wise, we could do better, but I'm afraid for that we'd need to start monkeypatching or proper-patching constructs. Compilation is the biggest road to parsing speedup, but a ton of parser features that we use don't support compiling.

In the current iteration, on the performance test of firehose parsing of DWARF from memory, the decrease in performance compared to the main branch, is ~30%. That's not the best result that I've got as I was fiddling with settings (mostly compile vs. no compile); in the best test result, the performance decrease was 15%.

The original PR was mostly motivated by built-in support for 24 bit ints. To that I might add that construct 2.10 has built-in support for ULEB128 (but not SLEB128) and some miscellanea like StreamOffset. That said, I can see that there is still some slack for performance improvement - if one is willing to mess with construct internals.

On the code style, with construct 2.10, the code becomes somewhat cleaner - e. g. there is no longer a notion that every parser object needs a name just in case it's included in a larger struct, and the premade parsers (the kind that is built in structs) are objects, not callables.

eliben commented 5 months ago

Thanks for working on this.

I'm still pondering whether it's worth it, given the performance regression. The upside doesn't seem too large from the original description in #477

@sevaa what additional functionality does this enable that we couldn't achieve without upgrading?

What are the latest performance numbers (I know you posted some in #477, but since this PR supersedes it, it's worth having here for completeness)

sevaa commented 5 months ago

Edited. There is no additional functionality that we could not achieve without upgrading :) I don't have a problem slapping together a constructs based parser for Int24, if needed. I don't know if SupremeMortal has a binary with those or it's just striving for perfection. I don't recall off the top of my head where in the DWARF standard is this datatype needed.

The big promise of construct 2.10 is compile(). From what I've read, compile() mostly shines on pypy as opposed to the vanilla CPython. On CPython it's a mixed blessing - I've seen compiling slowing things down. There is no magic to compile, it merely eliminates the overhead of dispatching (function calls, object creation, dictionary lookups, etc.) - the parsing logic per se is the same. We might get more mileage out of leveraging the structs machinery.

eliben commented 5 months ago

Thanks for the perspective. I remain unconvinced that this is worthwhile at this time. Performance is very important to some users of pyelftools, and while it's not exactly fast - making it even slower doesn't sound too appealing without a very good reason for the change.

sevaa commented 5 months ago

Okay then, kill both PRs, I won't cry :) This exercise did give me some ideas on possible performance enhancement though.

eliben commented 5 months ago

Placing this on hold and closed the older PR.

sevaa commented 4 months ago

Found the spec and a use case for 3 byte ints: DWARFv5 contains form DW_FORM_strx3, which pyelftools has an explicit TODO on. I, on my side, have a crash report about it being unsupported.

sevaa commented 4 months ago

@eliben Armed with a profiler, I did some exploratory poking around regarding performance enhancement. My benchmark was firehose parsing of all DIEs in every file in the dwarfdump autotest directory (optionally with follow-up into loclists, rangelists, and lineprograms).

I was able to reduce the time of the DIE-only scroll-through by 28%, of the scroll-through with follow-up by 34%.

I made the following enhancements, in the rough order of effect:

No breaking changes to the API.

Could poke some more, or could send a PR.

eliben commented 4 months ago

Sounds great -- looking forward to the PR. Please include the benchmarking harness too -- it's OK to split to multiple PRs if that makes sense