airbus-cyber / ghidralligator

Apache License 2.0

Missing support for `next2_sym_head` and `next2_head` #4

Closed M3NIX closed 1 month ago

M3NIX commented 2 months ago

Hello,

while working with the TriCore architecture I noticed that Ghidralligator crashes with the following error: terminate called after throwing an instance of 'SleighError'

After some digging I found that the crash happens while the .sla file is being loaded and parsed, and that it is triggered by this line:

<next2_sym_head name="inst_next2" id="0x3" scope="0x0"/>

The following switch statement in your code has no case for the next2 symbol header and therefore throws the error: https://github.com/airbus-cyber/ghidralligator/blob/master/src/slghsymbol.cc#L232-L262
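To check a .sla file for this kind of mismatch before loading it, a small script along these lines might help. It is only a sketch: it assumes the .sla file is plain XML (as in this issue), and the list of supported header tags is an assumption written from memory of the Ghidra 10.1.5-era sources, so it should be verified against the switch statement in slghsymbol.cc. The sample document is a heavily simplified, hypothetical stand-in for a real .sla file.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: header tags handled by a Ghidra 10.1.5-era symbol-header parser.
# Verify this set against the switch statement in slghsymbol.cc before relying on it.
SUPPORTED_HEADS = {
    "userop_head", "epsilon_sym_head", "value_sym_head", "valuemap_sym_head",
    "name_sym_head", "varnode_sym_head", "context_sym_head", "varlist_sym_head",
    "operand_sym_head", "start_sym_head", "end_sym_head", "subtable_sym_head",
    "flowdest_sym_head", "flowref_sym_head",
}

def unsupported_symbol_heads(sla_xml):
    """Return the distinct '*_head' element tags that are not in SUPPORTED_HEADS."""
    root = ET.fromstring(sla_xml)
    found = []
    for elem in root.iter():
        tag = elem.tag
        if tag.endswith("_head") and tag not in SUPPORTED_HEADS and tag not in found:
            found.append(tag)
    return found

# Hypothetical, simplified sample; a real .sla file contains far more structure.
SAMPLE_SLA = """
<sleigh>
  <symbol_table>
    <start_sym_head name="inst_start" id="0x1" scope="0x0"/>
    <next2_sym_head name="inst_next2" id="0x3" scope="0x0"/>
  </symbol_table>
</sleigh>
"""

print(unsupported_symbol_heads(SAMPLE_SLA))  # ['next2_sym_head']
```

For a real file, the contents could be read with `open(path).read()` and passed to `unsupported_symbol_heads`. Note that this only looks at header elements; it says nothing about whether the full symbol definitions are handled.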

Stacktrace:

Core was generated by `./ghidralligator -m replay -c /code/config.json -D -i /code/input.txt -t'.
Program terminated with signal SIGABRT, Aborted.
#0  0x0000751ab620a9fc in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt full
#0  0x0000751ab620a9fc in pthread_kill () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1  0x0000751ab61b6476 in raise () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#2  0x0000751ab619c7f3 in abort () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#3  0x0000751ab6545b9e in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#4  0x0000751ab655120c in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#5  0x0000751ab6551277 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#6  0x0000751ab65514d8 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
No symbol table info available.
#7  0x00005f3659d7c551 in SymbolTable::restoreSymbolHeader(Element const*) [clone .cold] ()
No symbol table info available.
#8  0x00005f3659dd9a73 in SymbolTable::restoreXml(Element const*, SleighBase*) ()
No symbol table info available.
#9  0x00005f3659dcd69e in SleighBase::restoreXml(Element const*) ()
No symbol table info available.
#10 0x00005f3659dc8030 in Sleigh::initialize(DocumentStorage&) ()
No symbol table info available.
#11 0x00005f3659d830e9 in main ()
No symbol table info available.

I have also found this line in the .sla file:

<next2_sym name="inst_next2" id="0x3" scope="0x0"/>

This does not cause a crash, but I am not sure that Ghidralligator handles it correctly either, since it is not mentioned anywhere in your codebase.

Here is a reference to these elements in the Ghidra codebase, in case it helps: https://github.com/NationalSecurityAgency/ghidra/blob/master/Ghidra/Framework/SoftwareModeling/src/main/java/ghidra/pcode/utils/SlaFormat.java#L170-L171

If you can point me to the correct files I can try to create a PR. For <next2_sym_head> I think the implementation is quite easy. I have started a fork, so maybe you can have a look at my commit: https://github.com/M3NIX/ghidralligator/commit/e6da571e19552155209b62d8fc99576a9ead5a4b For <next2_sym> I am less sure, because of the large pcodeparse.cc file.

Thanks!

c-eax commented 2 months ago

Hi, Ghidralligator is based on Ghidra 10.1.5, which is quite old. Maybe the SLA format has changed a little. The proper fix would be to migrate the Ghidralligator code to the latest Ghidra version. In the meantime, can you try to regenerate your SLA using Ghidra 10.1.5?

M3NIX commented 2 months ago

Hi, I have downloaded the release package of Ghidra 10.1.5, and these next2 symbols do not exist in its tricore .sla file. That matches your suspicion that the SLA format changed in newer versions.

After that I tried to compile the current spec files with the older Ghidra 10.1.5:

git clone git@github.com:NationalSecurityAgency/ghidra.git && cd ghidra
git checkout Ghidra_10.1.5_build 
# copy tricore processor spec files from master to Ghidra/Processors/tricore/data/languages/
gradle -I gradle/support/fetchDependencies.gradle init
gradle tricore:sleighCompile

Unfortunately that resulted in an error, which seems to be caused by the use of the new built-in p-code function lzcount (added in a commit from Mar 3, 2023), which did not exist back then (10.1.5 was released Jul 27, 2022):

> Task :tricore:sleighCompile FAILED
Compiling ./data/languages/tricore.slaspec:
tricore.sinc:1840: unknown macro, userop, or specific symbol 'lzcount' in macro, user operation, or subpiece application
Unrecoverable error(s), halting compilation
java.lang.NullPointerException: Cannot invoke "ghidra.pcodeCPort.slgh_compile.ExprTree.setOutput(ghidra.sleigh.grammar.Location, ghidra.pcodeCPort.semantics.VarnodeTpl)" because "e" is null
        at ghidra.sleigh.grammar.SleighCompiler.assignment(SleighCompiler.java:6759)
        at ghidra.sleigh.grammar.SleighCompiler.statement(SleighCompiler.java:6001)
        at ghidra.sleigh.grammar.SleighCompiler.statements(SleighCompiler.java:5848)
        at ghidra.sleigh.grammar.SleighCompiler.code_block(SleighCompiler.java:5795)
        at ghidra.sleigh.grammar.SleighCompiler.semantic(SleighCompiler.java:5705)
        at ghidra.sleigh.grammar.SleighCompiler.ctorsemantic(SleighCompiler.java:3573)
        at ghidra.sleigh.grammar.SleighCompiler.constructor(SleighCompiler.java:3483)
        at ghidra.sleigh.grammar.SleighCompiler.constructorlike(SleighCompiler.java:3034)

0 languages successfully compiled
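
Before attempting such a back-port, it may be quicker to scan the spec sources for built-ins the older compiler does not know. The following is a rough sketch: only lzcount is confirmed by the error above, any other name added to the tuple would be an assumption, and the sample text is a hypothetical fragment rather than the real content of tricore.sinc. It matches plain text, so it may also report uses inside comments.

```python
import re

# Built-ins assumed to be unknown to the older SLEIGH compiler.
# Only "lzcount" is confirmed by the compile error in this thread.
NEWER_BUILTINS = ("lzcount",)

def find_newer_builtins(spec_text, names=NEWER_BUILTINS):
    """Return (line_number, name) pairs for each call-like use of a listed built-in."""
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in names) + r")\s*\(")
    hits = []
    for lineno, line in enumerate(spec_text.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits

# Hypothetical fragment for illustration only.
SAMPLE_SPEC = """macro count_zeros(dst, src) {
    dst = lzcount(src);
}
"""

print(find_newer_builtins(SAMPLE_SPEC))  # [(2, 'lzcount')]
```

Running this over every .slaspec and .sinc file of a processor would show up front how many places need rewriting for the older toolchain.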

M3NIX commented 2 months ago

I just found the commit in the Ghidra repo which introduced the next2 instruction: https://github.com/NationalSecurityAgency/ghidra/commit/8d4a6c213ea252eec6dcb79079a6820a09418584

So I agree with you that the best solution would be to migrate Ghidralligator to a newer Ghidra version. I have therefore started PR #5.