Open ducphanduyagentp opened 2 weeks ago
Yeah, that machine looks pretty good. From the log, it seems the error happens during BSIM correlation. Since BSIM is a new Ghidra feature, perhaps it is running into an issue?
INFO | ghidriff | Starting BSIM correlator
INFO | ghidriff | Match Set 0 - 130832 matches [Correlator=Manual Match]
INFO | ghidriff | Match Set -1 - 0 matches [Correlator=Implied Match]
If you have this analysis already in Ghidra, you could test BSIM (to find out if BSIM is breaking) by doing the following.
If all that sounds unfamiliar, take a look at my VT tutorial https://cve-north-stars.github.io/docs/Ghidra-Patch-Diffing
Looking at the log above, BSIM in this instance starts with 130,832 seed matches, which are used in the code here: https://github.com/clearbluejar/ghidriff/blob/0ce2bbffb3cd5d9a5e8f544f7ebaad539635bf16/ghidriff/bsim.py#L95-L113
Maybe it is having trouble? You could verify by running steps 1-4 in version tracking in Ghidra. That would help us understand whether this is a Ghidra issue or something in ghidriff.
Are you able to provide the binaries you are diffing, or even tell me how large they are?
You could also try to run ghidriff with the --no-bsim flag to rule out BSIM as an issue.
If it still fails, that might mean there are issues later in the diffing pipeline, but it is something to try.
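For reference, here is a minimal sketch of how such a rerun could be scripted. The flags are the ones quoted in this thread (--force-diff, --no-bsim, --max-ram-percent). The helper name `build_ghidriff_cmd`, the positional old/new ordering, and the file names are assumptions for illustration, so check `ghidriff --help` before relying on it.

```python
import shlex


def build_ghidriff_cmd(old, new, use_bsim=True, max_ram_percent=None):
    """Build a ghidriff argument list (hypothetical helper, not part of ghidriff).

    Assumes the old and new binaries are passed positionally, in that order.
    """
    cmd = ["ghidriff", old, new, "--force-diff"]
    if not use_bsim:
        # Skip the BSIM correlator to rule it out as the source of the crash.
        cmd.append("--no-bsim")
    if max_ram_percent is not None:
        cmd += ["--max-ram-percent", str(max_ram_percent)]
    return cmd


if __name__ == "__main__":
    cmd = build_ghidriff_cmd("old.exe.gzf", "new.exe.gzf", use_bsim=False)
    # Print the command for inspection; to run it for real, pass `cmd`
    # to subprocess.run(cmd, check=True) with ghidriff on your PATH.
    print(shlex.join(cmd))
```

Building the command as a list keeps it easy to toggle one flag at a time between runs, which is the point of the experiment here.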
If you can provide the binaries, or other less sensitive binaries that present a similar issue, it would provide the best insight and help ghidriff better handle larger bins in the future.
I am trying to diff without bsim to see if that still happens. Nonetheless it would be nice to have bsim results with it :D because it sounds pretty promising.
The binaries are nothing confidential; I was diffing the two latest Foxit Reader versions. The binaries are about 113 MB each. It's worth noting that I tried analyzing them with IDA Pro, Ghidra, and Binary Ninja, and all took a very long time, so they are already big judging by the initial analysis time. Binary Ninja was the fastest because it took advantage of all cores on my machine.
Unfortunately, without BSIM it still happens:
INFO | ghidriff | Init Ghidra Diff Engine...
INFO | ghidriff | Engine Console Log: INFO
INFO | ghidriff | Engine File Log: ghidriffs/ghidriff.log INFO
INFO | ghidriff | Starting Ghidra...
INFO Using log config file: jar:file:/home/user/Downloads/ghidra_11.1.1_PUBLIC/Ghidra/Framework/Generic/lib/Generic.jar!/generic.log4j.xml (LoggingInitialization)
INFO Using log file: /home/user/ghidriffs/ghidriff.log (LoggingInitialization)
INFO Loading user preferences: /home/user/.config/ghidra/ghidra_11.1.1_PUBLIC/preferences (Preferences)
INFO Searching for classes... (ClassSearcher)
INFO Ignoring class 'ghidra.GhidraClassLoader' from '/home/user/Downloads/ghidra_11.1.1_PUBLIC/Ghidra/Framework/Utility/lib/Utility.jar'. Already found at '/home/user/Downloads/ghidra_11.1.1_PUBLIC/Ghidra/Framework/Utility/lib/Utility.jar'. (ClassSearcher)
INFO Ignoring class 'generic.jar.GClassLoader' from '/home/user/Downloads/ghidra_11.1.1_PUBLIC/Ghidra/Framework/Utility/lib/Utility.jar'. Already found at '/home/user/Downloads/ghidra_11.1.1_PUBLIC/Ghidra/Framework/Utility/lib/Utility.jar'. (ClassSearcher)
INFO Class search complete (471 ms) (ClassSearcher)
INFO Initializing SSL Context (SSLContextInitializer)
INFO Initializing Random Number Generator... (SecureRandomFactory)
INFO Random Number Generator initialization complete: NativePRNGNonBlocking (SecureRandomFactory)
INFO Trust manager disabled, cacerts have not been set (ApplicationTrustManagerFactory)
INFO | ghidriff | GHIDRA_INSTALL_DIR: /home/user/Downloads/ghidra_11.1.1_PUBLIC
INFO | ghidriff | GHIDRA 11.1.1 Build Date: 2024-Jun-14 1025 EDT Release: PUBLIC
INFO | ghidriff | Engine Args:
INFO | ghidriff | old: ['old.exe.gzf']
INFO | ghidriff | new: [['new.exe.gzf']]
INFO | ghidriff | engine: VersionTrackingDiff
INFO | ghidriff | output_path: ghidriffs
INFO | ghidriff | summary: False
INFO | ghidriff | project_location: ghidra_projects
INFO | ghidriff | project_name: ghidriff
INFO | ghidriff | symbols_path: symbols
INFO | ghidriff | threaded: True
INFO | ghidriff | force_analysis: False
INFO | ghidriff | force_diff: True
INFO | ghidriff | no_symbols: True
INFO | ghidriff | log_level: INFO
INFO | ghidriff | file_log_level: INFO
INFO | ghidriff | log_path: ghidriff.log
INFO | ghidriff | va: False
INFO | ghidriff | min_func_len: 10
INFO | ghidriff | use_calling_counts: False
INFO | ghidriff | gdt: []
INFO | ghidriff | bsim: False
INFO | ghidriff | bsim_full: False
INFO | ghidriff | max_ram_percent: 100
INFO | ghidriff | print_flags: False
INFO | ghidriff | jvm_args: None
INFO | ghidriff | side_by_side: False
INFO | ghidriff | max_section_funcs: 200
INFO | ghidriff | md_title: None
INFO | ghidriff | Setting Up Ghidra Project...
INFO Creating project: /home/user/ghidriffs/ghidra_projects/ghidriff-old.exe.gzf-new.exe.gzf/ghidriff-old.exe.gzf-new.exe.gzf (DefaultProject)
INFO | ghidriff | Created project: ghidriff-old.exe.gzf-new.exe.gzf
INFO | ghidriff | Project Location: /home/user/ghidriffs/ghidra_projects/ghidriff-old.exe.gzf-new.exe.gzf/
INFO | ghidriff | Importing old.exe.gzf as old.exe.gzf-b7fb88
INFO Using Loader: GZF Input Format (AutoImporter)
INFO Using Language/Compiler: null (AutoImporter)
INFO | ghidriff | Loaded old.exe - .ProgramDB
INFO | ghidriff | Importing new.exe.gzf as new.exe.gzf-a74d23
INFO Using Loader: GZF Input Format (AutoImporter)
INFO Using Language/Compiler: null (AutoImporter)
INFO | ghidriff | Loaded new.exe - .ProgramDB
INFO | ghidriff | Project Files:
INFO | ghidriff | ghidriff-old.exe.gzf-new.exe.gzf:/old.exe.gzf-b7fb88
INFO | ghidriff | ghidriff-old.exe.gzf-new.exe.gzf:/new.exe.gzf-a74d23
INFO | ghidriff | Program: old.exe.gzf-b7fb88 imported: True has_pdb: False pdb_loaded: False analyzed True
INFO | ghidriff | Program: new.exe.gzf-a74d23 imported: True has_pdb: False pdb_loaded: False analyzed True
INFO | ghidriff | Starting analysis for 2 binaries
INFO | ghidriff | Analyzing: new.exe - .ProgramDB
Using file gdts: [windows_vs12_32]
INFO | ghidriff | Analyzing: old.exe - .ProgramDB
INFO | ghidriff | Analysis already complete.. skipping new.exe - .ProgramDB!
Using file gdts: [windows_vs12_32]
INFO | ghidriff | Analysis already complete.. skipping old.exe - .ProgramDB!
INFO | ghidriff | Analysis for ghidriff-old.exe.gzf-new.exe.gzf:/new.exe.gzf-a74d23 complete
INFO | ghidriff | Analysis for ghidriff-old.exe.gzf-new.exe.gzf:/old.exe.gzf-b7fb88 complete
INFO | ghidriff | Diffing bins: old.exe.gzf - new.exe.gzf
INFO | ghidriff | Setup 48 decompliers
INFO | ghidriff | Loaded old program: old.exe
INFO | ghidriff | Loaded new program: new.exe
INFO | ghidriff | p1 sym count: reported: 2338147 analyzed: 52645
INFO | ghidriff | p2 sym count: reported: 2300389 analyzed: 52641
INFO | ghidriff | Found unmatched: 1126 matched: 52080 symbols
INFO Hashing symbols in old.exe (ConsoleTaskMonitor)
INFO Hashing symbols in new.exe (ConsoleTaskMonitor)
INFO Eliminate non-unique matches (ConsoleTaskMonitor)
INFO Finding symbol matches (ConsoleTaskMonitor)
INFO | ghidriff | Exec time: 0.8740 secs
INFO | ghidriff | Match count 126080
INFO | ghidriff | Counter({('SymbolsHash',): 10196})
INFO | ghidriff | Running correlator: ExactBytesFunctionHasher
INFO | ghidriff | name: ExactBytesFunctionHasher one_to_one: True one_to_many: False
INFO Hashing functions in old.exe (ConsoleTaskMonitor)
INFO Hashing functions in new.exe (ConsoleTaskMonitor)
INFO Finding function matches (ConsoleTaskMonitor)
INFO | ghidriff | Match count: 38299
INFO | ghidriff | ExactBytesFunctionHasher Exec time: 46.4411 secs
INFO | ghidriff | Running correlator: ExactInstructionsFunctionHasher
INFO | ghidriff | name: ExactInstructionsFunctionHasher one_to_one: True one_to_many: False
INFO | ghidriff | Match count: 82287
INFO | ghidriff | ExactInstructionsFunctionHasher Exec time: 42.7090 secs
INFO | ghidriff | Running correlator: StructuralGraphExactHash
INFO | ghidriff | name: StructuralGraphExactHash one_to_one: True one_to_many: False
INFO | ghidriff | Match count: 873
INFO | ghidriff | StructuralGraphExactHash Exec time: 173.6129 secs
INFO | ghidriff | Running correlator: ExactMnemonicsFunctionHasher
INFO | ghidriff | name: ExactMnemonicsFunctionHasher one_to_one: True one_to_many: False
INFO | ghidriff | Match count: 50
INFO | ghidriff | ExactMnemonicsFunctionHasher Exec time: 39.9167 secs
INFO | ghidriff | Running correlator: BSIM
INFO | ghidriff | name: BSIM one_to_one: True one_to_many: False
INFO | ghidriff | Skipping BSIM correlator. BSIM disabled with arg --no-bsim
INFO | ghidriff | BSIM Exec time: 0.0001 secs
INFO | ghidriff | Running correlator: BulkInstructionHash
INFO | ghidriff | name: BulkInstructionHash one_to_one: True one_to_many: False
INFO | ghidriff | Match count: 3
INFO | ghidriff | BulkInstructionHash Exec time: 88.4950 secs
INFO | ghidriff | Running correlator: SigCallingCalledHasher
INFO | ghidriff | name: SigCallingCalledHasher one_to_one: True one_to_many: False
INFO | ghidriff | Match count: 1607
INFO | ghidriff | SigCallingCalledHasher Exec time: 72.1080 secs
INFO | ghidriff | Running correlator: StringsRefsHasher
INFO | ghidriff | name: StringsRefsHasher one_to_one: True one_to_many: False
INFO | ghidriff | Match count: 3790
INFO | ghidriff | StringsRefsHasher Exec time: 94.8933 secs
INFO | ghidriff | Running correlator: StrUniqueFuncRefsHasher
INFO | ghidriff | name: StrUniqueFuncRefsHasher one_to_one: True one_to_many: False
INFO | ghidriff | Match count: 680
INFO | ghidriff | StrUniqueFuncRefsHasher Exec time: 13.6850 secs
INFO | ghidriff | Running correlator: SwitchSigHasher
INFO | ghidriff | name: SwitchSigHasher one_to_one: True one_to_many: False
INFO | ghidriff | Match count: 121
INFO | ghidriff | SwitchSigHasher Exec time: 56.7480 secs
INFO | ghidriff | Running correlator: StructuralGraphHash
INFO | ghidriff | name: StructuralGraphHash one_to_one: True one_to_many: True
#
# A fatal error has been detected by the Java Runtime Environment:
#
# SIGSEGV (0xb) at pc=0x00007ac0f1c21b3d, pid=502616, tid=502616
#
# JRE version: OpenJDK Runtime Environment (21.0.3+9) (build 21.0.3+9-Ubuntu-1ubuntu122.04.1)
# Java VM: OpenJDK 64-Bit Server VM (21.0.3+9-Ubuntu-1ubuntu122.04.1, mixed mode, tiered, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# V [libjvm.so+0x821b3d] void G1ConcurrentRefineOopClosure::do_oop_work<oopDesc*>(oopDesc**)+0x4d
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E" (or dumping to /home/user/core.502616)
#
# An error report file with more information is saved as:
# /home/user/hs_err_pid502616.log
[948.357s][warning][os] Loading hsdis library failed
#
# If you would like to submit a bug report, please visit:
# https://bugs.launchpad.net/ubuntu/+source/openjdk-21
#
[1] 502616 IOT instruction (core dumped) ghidriff --force-diff --max-ram-percent 100 --no-symbols --no-bsim
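To pin down where a run like this dies, a rough sketch such as the following can pull the per-correlator timings out of the console log and report any correlator that started but never logged an exec time. It assumes only the two line shapes visible in the log above ("Running correlator: NAME" and "NAME Exec time: N secs"); the function name `summarize_correlators` is made up for illustration.

```python
import re

# Line shapes taken from the ghidriff console log in this thread.
START_RE = re.compile(r"Running correlator: (\w+)")
TIME_RE = re.compile(r"(\w+) Exec time: ([0-9.]+) secs")


def summarize_correlators(log_text):
    """Return (timings, unfinished) for a ghidriff console log.

    timings    -- dict mapping correlator name to its reported seconds
    unfinished -- correlators that started but never logged an exec time
    """
    started = []
    timings = {}
    for line in log_text.splitlines():
        start = START_RE.search(line)
        if start:
            started.append(start.group(1))
            continue
        done = TIME_RE.search(line)
        if done:
            timings[done.group(1)] = float(done.group(2))
    unfinished = [name for name in started if name not in timings]
    return timings, unfinished


if __name__ == "__main__":
    sample = (
        "INFO | ghidriff | Running correlator: SwitchSigHasher\n"
        "INFO | ghidriff | SwitchSigHasher Exec time: 56.7480 secs\n"
        "INFO | ghidriff | Running correlator: StructuralGraphHash\n"
        "# A fatal error has been detected by the Java Runtime Environment:\n"
    )
    timings, unfinished = summarize_correlators(sample)
    print(timings)
    print(unfinished)
```

On the log above, this would flag StructuralGraphHash as the correlator that was running when the JVM crashed, since it is the only one with a start line and no exec time.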
Hi,
I've encountered this error multiple times when diffing large binaries. I've tried tweaking max RAM percent, changing the JDK, changing options, and exporting to a Ghidra Zip File from the Ghidra UI, but the diff never completes. In this specific instance I got SIGSEGV, and in some others I got SIGBUS.
My machine has 64 GB RAM, 16 GB swap, a lot of storage, and a pretty fast CPU. I've looked this error up and nothing much has come up. Please advise. I've been running and waiting for hours and also tried the Docker container, but nothing works.
Thanks!