google / bloaty

Bloaty: a size profiler for binaries
Apache License 2.0

Report ELF section/segment headers as such. #284

Closed haberman closed 3 years ago

haberman commented 3 years ago

Previously we attributed section/segment headers themselves to the section/segment they describe, e.g. the .text header was counted as part of the .text section. After this change it is reported as part of [ELF Section Headers].

The rationale for the previous behavior was that we generally want to account for the entire "footprint" of a given entity, including all of the metadata emitted as part of that entity. For example, for -d symbols we attribute the .eh_frame footprint of each function to the corresponding function. This is much more useful than just seeing a chunk of [.eh_frame] with no breakdown.

However, I think the case of section/segment headers is somewhat different. These headers are small and fixed-length, so there is nothing the section/segment itself could have done to grow or shrink its header except not exist to begin with.

Consider a specific example: you have 100 functions that you compile with -ffunction-sections. This generates 100 section headers totaling 6400 bytes, since each section header is 64 bytes. It is not especially useful for Bloaty to report each section of length N as N+64 to account for its header. In fact, this obscures the fact that 6400 bytes of the binary are spent on section headers. Those 64 bytes are not really part of the "weight" of each function; the "weight" actually comes from the -ffunction-sections argument, which caused the 100 functions to emit 6400 bytes of section headers instead of 64.
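
As a rough check of the arithmetic above, here is a minimal sketch. It assumes a little-endian ELF64 target, where the section header struct (Elf64_Shdr) packs to 64 bytes. The format string and the names ELF64_SHDR_FORMAT, SHDR_SIZE and section_header_overhead are illustrative only and are not taken from Bloaty.

```python
import struct

# Elf64_Shdr field layout (little-endian, no padding):
#   sh_name, sh_type            -> two u32
#   sh_flags, sh_addr,
#   sh_offset, sh_size          -> four u64
#   sh_link, sh_info            -> two u32
#   sh_addralign, sh_entsize    -> two u64
ELF64_SHDR_FORMAT = "<IIQQQQIIQQ"  # assumption: ELF64, little-endian
SHDR_SIZE = struct.calcsize(ELF64_SHDR_FORMAT)


def section_header_overhead(num_sections: int) -> int:
    """Total bytes spent on section headers for num_sections sections."""
    return num_sections * SHDR_SIZE


one_section = section_header_overhead(1)
per_function_sections = section_header_overhead(100)
print(SHDR_SIZE, one_section, per_function_sections)
```

The difference between the last two values is the cost that comes from the compile flag, not from any individual function.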

Ultimately, the strongest arguments for this change are:

  1. As described above, the number of sections in the binary (and the resulting total section overhead) is really a function of the way the binary was compiled, and doesn't have much to do with the individual sections themselves or the data contained therein. If the section overhead is too high, the solution is to compile the binary in a different way, not to trim the contents of the individual sections.
  2. If we do not surface the total weight of the section headers in -d sections, there is really no other way to get this information from Bloaty. There is no other data source that will split out section headers from section contents.
  3. It makes the memory maps look much more concise and logical. I think the lit tests in this commit will make that clear. Seeing the section headers attributed by section made the memory map noisier and less useful.
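
To illustrate the total that point 2 refers to, the sketch below reads the section header entry size and count straight from an ELF file header and multiplies them. It is not how Bloaty computes anything. The function name total_shdr_bytes is hypothetical, and the sketch assumes a little-endian ELF64 file with fewer than 0xff00 sections.

```python
import struct


def total_shdr_bytes(elf: bytes) -> int:
    """Return e_shentsize * e_shnum from a little-endian ELF64 file header."""
    if elf[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the data encoding
    # (1 = little-endian). Other layouts are out of scope for this sketch.
    if elf[4] != 2 or elf[5] != 1:
        raise ValueError("sketch only handles little-endian ELF64")
    # In the ELF64 header, e_shentsize and e_shnum are the u16 fields at
    # byte offsets 58 and 60.
    shentsize, shnum = struct.unpack_from("<HH", elf, 58)
    # Not handled: files with 0xff00 or more sections store e_shnum as 0
    # and keep the real count in the first section header's sh_size.
    return shentsize * shnum
```

Passing the first 64 bytes of a binary built with -ffunction-sections would give the combined size of its section headers, which is the number this change surfaces as [ELF Section Headers].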
haberman commented 3 years ago

cc @learn-more I think this is the direction I'd like to go with PE also (see the memory maps in the test cases).

learn-more commented 3 years ago

> cc @learn-more I think this is the direction I'd like to go with PE also (see the memory maps in the test cases).

Yeah, will update my PR with this change.