noseglasses / elf_diff

A tool to compare ELF binaries
GNU General Public License v3.0

Exit status does not report "same" or "different", making it impossible to write a script around elf_diff #107

Open derekatkins opened 8 months ago

derekatkins commented 8 months ago

Describe the bug: I am trying to use elf_diff in a script to compare whether there was a change in code between two compiles of (ostensibly) the same code. Because this is in a script, I was looking to use the exit code from elf_diff (similar to how you can use the exit code from diff, or cmp, to know if the files being compared are the same or have differences). Unfortunately elf_diff appears to always exit with the same exit code regardless of whether there are actual differences.

To Reproduce: Steps to reproduce the behavior:

  1. run comparing a file to itself: elf_diff /bin/bash /bin/bash
  2. Check the output status (in this case, 0)
  3. re-run with files that are definitely different: elf_diff /bin/bash /bin/true
  4. Check the output status and see that it is again 0

Expected behavior: I expected the exit status to be a summary of whether or not the compared files are the same. For example, if you use 'cmp' instead then when comparing /bin/bash to itself you get an exit status of 0, and when comparing bash to true you get an exit status of 1. This makes it scriptable.
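For illustration, the cmp-style convention described above can be sketched in Python as follows. This is only a byte-wise comparison, not anything elf_diff does; the function names `cmp_style_status` and `main` are hypothetical and chosen for this example.

```python
import filecmp


def cmp_style_status(path_a, path_b):
    """Return 0 if the files are byte-identical, 1 if they differ, 2 on error.

    This mirrors the exit-status convention of cmp; it is a plain byte-wise
    comparison and knows nothing about ELF contents.
    """
    try:
        same = filecmp.cmp(path_a, path_b, shallow=False)
    except OSError:
        # Missing or unreadable file: report trouble rather than "different"
        return 2
    return 0 if same else 1


def main(argv):
    """Hypothetical entry point: argv is [program, file_a, file_b]."""
    if len(argv) != 3:
        return 2
    return cmp_style_status(argv[1], argv[2])
```

A script would typically end with `sys.exit(main(sys.argv))` so that the shell sees the status.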

noseglasses commented 8 months ago

Thanks for the detailed report. Currently, elf_diff uses the exit code to report errors during processing rather than actual differences between binaries. There's currently no way to directly achieve what you intend. The tool was designed to generate verbose reports, rather than just returning true/false.

However, what you are asking for would be a really nice feature. But there are other, simpler tools that can already tell if two binaries are different. If a true binary comparison, e.g. of stripped binaries or hex files, would not work for you, I wonder whether you are rather looking for something that answers the question of whether two binaries are equivalent. How is such an equivalence defined? Would it, e.g., mean that the stripped binaries are equal and only symbol names changed?

Anyway, it would be easy to add an additional command line flag that would make elf_diff report true/false via its exit code.

derekatkins commented 8 months ago

Hi. Thank you for the reply. The use-case I have is that I have an embedded firmware system that (for better or worse) re-links every library and every executable every time it "rebuilds" the target, even if it never recompiles anything, even if no code changed. So what I want is to find a tool that can tell me whether the code itself has changed, ignoring extraneous metadata like file paths, link dates, etc. I have tried several different tools so far and yours is, by far, the best at actually noticing that, e.g., a re-linked libfoobar.so has NOT changed, even when cmp says they have. My end-goal is to write a script that takes two file trees and gives me the "exe-diff" between them, so I can send the fewest "changed" files between versions.
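A minimal sketch of the two-tree comparison described above might look like this. All names (`relative_files`, `bytes_equal`, `changed_files`) are hypothetical. The default comparator is a strict byte-wise check, which is exactly what would flag re-linked but unchanged files, so in practice it would be swapped for an equivalence check built on elf_diff.

```python
import filecmp
import os


def relative_files(root):
    """Return the set of file paths below root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def bytes_equal(old_path, new_path):
    """Placeholder comparator: strict byte-wise equality."""
    return filecmp.cmp(old_path, new_path, shallow=False)


def changed_files(old_root, new_root, equivalent=bytes_equal):
    """Classify files as added, removed, or changed between two trees.

    `equivalent` is any callable taking (old_path, new_path) and returning
    True when the two files should be treated as unchanged.
    """
    old_files = relative_files(old_root)
    new_files = relative_files(new_root)
    changed = sorted(
        rel
        for rel in old_files & new_files
        if not equivalent(os.path.join(old_root, rel), os.path.join(new_root, rel))
    )
    return {
        "added": sorted(new_files - old_files),
        "removed": sorted(old_files - new_files),
        "changed": changed,
    }
```

Passing the comparator as a parameter keeps the tree walk independent of how "unchanged" is defined.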

noseglasses commented 8 months ago

Ok, got it. That's kind of what I had in mind when I wrote about "equivalence".

I will see what I can do with enabling this feature.

For the time being a possible workaround would be to write a small wrapper script that makes elf_diff write its report to a temp-file, grep the report to find out if there are differences and then let the script return the exit code that you expected of elf_diff.

An addition to the script approach would be to replace the html plugin with a custom version that simply writes the sum of counts of the migrated, disappeared, ... symbols to a text file. That would greatly simplify parsing the report. That solution, however, requires some knowledge of Python's Jinja package.

noseglasses commented 7 months ago

I just added a plugin that creates text files that only contain statistics of the diff. The change has not been released yet but is available in current master. @derekatkins, could you give that a try and let me know if it works for you? Run with elf_diff --stats_txt_file <stats_file> <old_binary> <new_binary>. If both files are "equivalent" then the file contains the string "No significant differences.". Otherwise it says "File differ.".
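A wrapper along the following lines could turn the stats file into a cmp-style exit status. It is a sketch under assumptions: that `elf_diff` is on the PATH, that `--stats_txt_file` takes the output path as its argument (as in the command shown later in this thread), and that the marker string appears in the file exactly as quoted above. The names `stats_report_equivalence` and `elf_diff_status` are hypothetical.

```python
import os
import subprocess
import tempfile

# Marker string as quoted in the comment above (assumed to be exact)
NO_DIFF_MARKER = "No significant differences."


def stats_report_equivalence(stats_text):
    """Return True if the stats text reports no significant differences."""
    return NO_DIFF_MARKER in stats_text


def elf_diff_status(old_binary, new_binary, elf_diff_cmd=("elf_diff",)):
    """Return 0 if equivalent, 1 if different, 2 if elf_diff could not run."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        stats_path = os.path.join(tmp_dir, "stats.txt")
        cmd = [*elf_diff_cmd, "--stats_txt_file", stats_path, old_binary, new_binary]
        try:
            result = subprocess.run(
                cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
            )
        except OSError:
            # The command was not found or could not be started
            return 2
        if result.returncode != 0 or not os.path.exists(stats_path):
            return 2
        with open(stats_path, encoding="utf-8") as stats_file:
            return 0 if stats_report_equivalence(stats_file.read()) else 1
```

The `elf_diff_cmd` parameter is there so the invocation can be adjusted, for example to `("python", "bin/elf_diff")` when running from a checkout.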

fkerle commented 6 months ago

@noseglasses Hi & thank you for sharing your work!

I'm in a different situation, but have tried --stats_txt_file and wanted to report back: If I'm not mistaken, elf_diff does detect assembly (instruction) differences?
If so, then I'm puzzled why the --stats_txt_file reports No significant differences even when diff <(arm-none-eabi-objdump -d a) <(arm-none-eabi-objdump -d b) does show differences. (elf_diff was using the same binutils, using --bin_dir and --bin_prefix)

~The only thing I could come up with is that elf_diff is smart enough to recognize relocated variables and not flag changes, e.g. in movw, from their new addresses vs. the old addresses?~

EDIT: I found that I can grep -Ri '<table class="diff"' to see if tables listing OLD vs. NEW changes exist, which in my case they indeed do. Why does elf_diff regard those changes as not significant?
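The grep check above can be reproduced in Python roughly as follows. This is a sketch; the function name `reports_with_diff_tables` is hypothetical, and the marker is simply the string from the grep command.

```python
import os

# Same marker as in the grep command above
DIFF_TABLE_MARKER = '<table class="diff"'


def reports_with_diff_tables(report_dir, marker=DIFF_TABLE_MARKER):
    """Return the sorted paths of files below report_dir containing the marker.

    Like grep -Ri, the search is recursive and case-insensitive.
    """
    needle = marker.lower()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(report_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on non-UTF-8 files
            with open(path, encoding="utf-8", errors="replace") as handle:
                if needle in handle.read().lower():
                    hits.append(path)
    return sorted(hits)
```

An empty result means no page of the report contains a diff table.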

Would it not be practical to have a way to look at a summary of those changes too? (i.e. the left hand navigation)

Any help would be greatly appreciated. Happy holidays from Austria, Florian

noseglasses commented 6 months ago

@fkerle, could you possibly provide a minimum example with two binary files and the exact binutils-versions you are using? THX

noseglasses commented 6 months ago

...happy holidays to you, too!

noseglasses commented 6 months ago

Just had a try with the test binaries tests/arm/libelf_diff_test2_debug_old.a and tests/arm/libelf_diff_test2_debug_new.a from the elf_diff repo. For these, differences are correctly reported in the stats text file. So I really need your files, @fkerle, to see what's going wrong. I am using arm-none-eabi-objdump 2.38 on Ubuntu 22.04.

noseglasses commented 6 months ago

Regarding the question whether elf_diff is smart with assembly and relocations. No, it's not. It completely relies on binutils and only compares (mangled) symbol names, symbol sizes and assembly code (text-based).
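To make the idea of a purely text-based comparison concrete, here is a sketch of comparing two symbol tables by name, size, and assembly text. It is not elf_diff's actual implementation: the `Symbol` type, the function names, and the category labels are hypothetical, and the input is assumed to have been extracted from binutils output beforehand.

```python
from typing import Dict, NamedTuple


class Symbol(NamedTuple):
    size: int       # symbol size in bytes
    assembly: str   # disassembly text of the symbol


def classify_symbols(old: Dict[str, Symbol], new: Dict[str, Symbol]):
    """Compare two symbol tables keyed by (mangled) symbol name."""
    old_names, new_names = set(old), set(new)
    persisting = old_names & new_names
    return {
        "disappeared": sorted(old_names - new_names),
        "appeared": sorted(new_names - old_names),
        "size_changed": sorted(
            name for name in persisting if old[name].size != new[name].size
        ),
        # Plain string comparison: a changed address in an operand counts
        # as a difference, since there is no relocation awareness here
        "assembly_changed": sorted(
            name for name in persisting if old[name].assembly != new[name].assembly
        ),
    }


def has_significant_differences(classification):
    """Return True if any category in the classification is non-empty."""
    return any(classification.values())
```

Because the assembly is compared as text, two functionally identical functions whose operands reference different addresses would still be reported as changed.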

fkerle commented 5 months ago

Hi @noseglasses thanks for your swift reply, I'd been on holiday (and sick leave for that matter) shortly after.

If I understand your previous comment correctly, assembly code difference should trigger a significant difference?

Example: formatted-main.elf <> formatted-none.elf ELFs.zip (formatted-main has formatting changes in main.c)

Comparing assembly code manually shows differences, as does the multipage report, for at least MX_ADC1_Init:
multipage_pair_report/details/persisting/1745.html#persisting_symbol_details_1745 multipage_pair_report.zip

--stats_txt_file however shows "no significant differences": formatted-stats.txt

command for diffing the ELFs was: python bin/elf_diff --stats_txt_file formatted-stats.txt --bin_dir "/opt/st/stm32cubeide_1.12.0/plugins/com.st.stm32cube.ide.mcu.externaltools.gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.200.202301161003/tools/bin" --bin_prefix "arm-none-eabi-" /media/D/formatted-*

binutils version: gnu-tools-for-stm32.10.3-2021.10.linux64_1.0.200.202301161003 they are available for free from the st download site, for the ST IDE. I can pack the binutils you need, if that makes debugging easier for you.

noseglasses commented 5 months ago

@fkerle, thanks for the detailed information. Yes, assembly differences are considered "significant". I will try to find out what is going wrong when using your binaries...

fkerle commented 5 months ago

@noseglasses you're welcome. I appreciate you taking the time to investigate. let me know if I should provide the toolchain (and which executables) for you.

BR Florian

noseglasses commented 5 months ago

@fkerle, sorry I was busy last week so it took me a while to come back to this. I merged a fix for the reported problem to master that now also considers persisting symbols' assembly differences significant. Please let me know if that works for you.

fkerle commented 5 months ago

@noseglasses no need to apologize. the change seems to work: Differences are reported! formatted-stats.txt