google / bindiff

Quickly find differences and similarities in disassembled code
https://zynamics.com/bindiff.html
Apache License 2.0

Importing symbols from a loaded diff in IDA is significantly slower than a diff performed within IDA #3

Open cblichmann opened 1 year ago

cblichmann commented 1 year ago

Steps to reproduce the problem:

  1. Load two large binaries and create IDBs
  2. Bindiff one of them against the other
  3. Import a couple of symbols and note the time it takes
  4. Save the diff results
  5. Restart IDA
  6. Load saved diff results
  7. Import the same number of symbols and again note the time it takes.

What is the expected behavior?

It should take about the same amount of time.

What went wrong?

I bindiffed bindiff (haha) against binexport and put breakpoints on calls to BinExport2::BinExport2 and google::protobuf::MessageLite::ParsePartialFromIstream. When working with a fresh diff performed within IDA, BinDiff does not load the BinExport file for every symbol. When working with loaded diff results, however, BinDiff loads the BinExport files again for EACH symbol that is being ported.

What version of the product are you using? On what operating system?

BinDiff 7, Windows x64, IDA 7.6.210427

Any other comments?

No

Ported from b/199001147

cblichmann commented 1 year ago

Hi there, thanks for the report. This is a somewhat known issue. The inefficiency dates back to BinDiff 4.3, where we switched to BinExport v2: the newer BinExport format is one big proto that cannot be parsed partially. To fix the performance issue, we need to refactor this logic to load the proto only once and cache it, similar to the fast path you describe.

cblichmann commented 1 year ago

b/200299140 is a duplicate of this.