eliben / pyelftools

Parsing ELF and DWARF in Python
Skip the cache of DWARFInfo and CU. #540

Open ThinkerYzu opened 7 months ago

ThinkerYzu commented 7 months ago

Add DWARFInfo.skip_cache() and DWARFInfo.enable_cache() to give users control over caching.

When parsing the DWARF of a large binary, we may want to skip the cache so that memory is released as soon as possible and no CPU cycles are spent maintaining a cache.

One of my use cases is to extract types, functions, and call sites from the DWARF of a Linux kernel image. With caches, it takes about 573 seconds to go through all DIEs. Skipping caches reduces the time to 448 seconds. It is about 27% faster. When going through every DIE sequentially, the cache doesn't help us at all.
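As a quick check of the two timings quoted above (not part of the PR), the arithmetic can be worked out as below. Only the 573 and 448 second figures come from the comment; the variable names are made up for illustration.

```python
# Timings quoted in the comment above, in seconds.
with_cache = 573
without_cache = 448

saved = with_cache - without_cache        # seconds saved by skipping caches
time_reduction = saved / with_cache       # fraction of the original runtime removed
speedup = with_cache / without_cache      # ratio of old runtime to new runtime

print(f"saved {saved} s, {time_reduction:.1%} less time, {speedup:.2f}x speedup")
# prints: saved 125 s, 21.8% less time, 1.28x speedup
```

So "about 27% faster" matches the speedup ratio of roughly 1.28, while the wall-clock time drops by about 22%.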

ThinkerYzu commented 7 months ago

> The switch should be positive (enable cache, not skip cache), there should be many more comments, and there should be tests.

Do you mean that caches should be disabled by default?

eliben commented 6 months ago

> > The switch should be positive (enable cache, not skip cache), there should be many more comments, and there should be tests.
>
> Do you mean that caches should be disabled by default?

No.

It can be something like "enable cache", which is true by default but can be set to false if needed.
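A minimal sketch of what such a positive switch could look like, assuming a flag named `enable_cache` that defaults to true. `DieReader`, `get_die`, and `_parse` are hypothetical names for a toy stand-in; this is not pyelftools code and not the API from the PR.

```python
class DieReader:
    """Toy stand-in for a DWARF reader (hypothetical, not pyelftools)."""

    def __init__(self, enable_cache=True):
        # Positive switch: caching is on unless the caller turns it off.
        self.enable_cache = enable_cache
        self._cache = {}
        self.parse_count = 0

    def _parse(self, offset):
        # Placeholder for the expensive parse of one DIE.
        self.parse_count += 1
        return {"offset": offset}

    def get_die(self, offset):
        if not self.enable_cache:
            # Sequential scans: parse and hand back, keep nothing around.
            return self._parse(offset)
        if offset not in self._cache:
            self._cache[offset] = self._parse(offset)
        return self._cache[offset]


cached = DieReader()                       # default: cache enabled
streaming = DieReader(enable_cache=False)  # opt out for one-pass scans
for reader in (cached, streaming):
    for offset in (0, 16, 32):
        reader.get_die(offset)

print(len(cached._cache), len(streaming._cache))  # prints: 3 0
```

With the default left at true, existing callers keep the current behaviour, and only one-pass scans over a large binary need to pass `enable_cache=False`.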