microsoft / SizeBench

SizeBench is a binary size investigation tool for Windows
MIT License
103 stars 14 forks source link

Fix a couple issues in BinaryBytes #38

Closed Austin-Lamb closed 7 months ago

Austin-Lamb commented 7 months ago

Why is this change being made?

When BinaryBytes uses include-all-symbols a couple of bugs have been observed - multiple symbols at the same RVA kept being seen, which would then fail to build the database because the "symbol RVA -> symbol ID" map during insert to SQL would fail to add two entries with the same RVA.

Briefly summarize what changed

Symbols aren't guaranteed to be ordered, so we could end up having a symbol in the first byte of a COFF Group and then we'd also create the "padding at beginning of COFF Group" symbol and then there's two BinaryBytes items at the same RVA. The fix is to order the symbols by RVA.

Then, the next problem is some symbols can have zero size (like a data symbol that represent a char[0] in C++). So that means the zero-byte symbol and the next one can also share an RVA - the fix is to ignore zero-byte symbols as they're not relevant for BinaryBytes anyway.  

How was the change tested?

Manually with a very large and complex binary that exhibited all these symptoms.

PR Checklist