thixotropist / ghidra_import_tests

Experimental framework for testing Ghidra binary import support

Test emulation of vectorized RISCV64 application #23

Open thixotropist opened 1 month ago

thixotropist commented 1 month ago

Develop an emulation test for RISCV64 applications built with complex ISA extensions and compiler optimization. The current (Ghidra 11.1-DEV) SLEIGH code for RISCV models many instructions as user-defined pcode operations without semantics. Optimizing compilers like gcc-14 will often emit these instructions, breaking Ghidra emulation.
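
To make the failure mode concrete, here is a toy Python model. It is not Ghidra's emulator API, only a sketch of the idea: an interpreter that can execute operations with defined semantics but has to stop at an opaque user-defined operation. All names (`ToyEmulator`, `MissingSemantics`, the `vadd_vv` op) are hypothetical.

```python
class MissingSemantics(Exception):
    """Raised when the toy emulator meets an operation it cannot execute."""


class ToyEmulator:
    def __init__(self):
        self.regs = {}
        # Operations with known semantics: name -> function(a, b) -> result.
        self.semantics = {
            "INT_ADD": lambda a, b: (a + b) & 0xFFFFFFFFFFFFFFFF,
            "INT_XOR": lambda a, b: a ^ b,
        }

    def step(self, op, dst, src1, src2):
        handler = self.semantics.get(op)
        if handler is None:
            # An opaque user-defined op: nothing says how it changes state.
            raise MissingSemantics(op)
        self.regs[dst] = handler(self.regs[src1], self.regs[src2])

    def run(self, program):
        for insn in program:
            self.step(*insn)


emu = ToyEmulator()
emu.regs.update({"a0": 5, "a1": 7})
program = [
    ("INT_ADD", "a2", "a0", "a1"),
    ("vadd_vv", "v1", "v2", "v3"),  # hypothetical opaque user-defined op
]
try:
    emu.run(program)
    outcome = "completed"
except MissingSemantics as exc:
    outcome = f"stopped at {exc}"
```

The scalar add executes, but the run stops at the first operation that has no semantics. That is the behavior the test case is meant to expose.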

This will be an example of test-driven design: we set up a test case that Ghidra currently fails, then use it to evaluate different design approaches to making it pass. This project leans toward network applications, so we will base the test on a component of the network processing framework DPDK.

The initial test target will be the trie lookup subsystem used by rte_acl_classify within the dpdk-l2fwd demo application. This subsystem performs a lookup on packet label information (Ethernet, VLAN, or MPLS) to determine the next-hop handler. It should accept a list of packets as input and generate a list of next-hop handlers as output, using the vector and bit-manipulation instructions provided by the local microarchitecture. Packet ordering within the lists must be preserved for packets with the same lookup value. DPDK already provides vector implementations of rte_acl_classify for Intel AVX and ARM Neon.
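
The input/output contract described above can be sketched as a scalar reference model in Python. This is an assumption-laden illustration, not the DPDK API or its trie layout: `build_trie`, `classify_batch`, the byte-wise exact-match trie, and the handler names are all hypothetical. A model like this could serve as the expected-output oracle when checking an emulated vectorized lookup.

```python
def build_trie(rules):
    """Build a byte-wise trie from (label_bytes, handler) pairs."""
    root = {}
    for label, handler in rules:
        node = root
        for byte in label:
            node = node.setdefault(byte, {})
        node["handler"] = handler  # string key cannot collide with int byte keys
    return root


def classify_batch(trie, labels, default="drop"):
    """Return one handler per input label, in input order (exact match only)."""
    results = []
    for label in labels:
        node = trie
        for byte in label:
            node = node.get(byte)
            if node is None:
                break  # no rule shares this prefix
        results.append(default if node is None else node.get("handler", default))
    return results


# Hypothetical rules keyed on a 2-byte big-endian VLAN ID.
rules = [
    (b"\x00\x64", "handler_vlan100"),
    (b"\x00\xc8", "handler_vlan200"),
]
trie = build_trie(rules)
batch = [b"\x00\x64", b"\x00\xc8", b"\x00\x64", b"\x01\x00"]
handlers = classify_batch(trie, batch)
```

Because the output list is positionally aligned with the input list, ordering is preserved for every packet, which trivially satisfies the weaker requirement that packets with the same lookup value stay in order.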

Since this test case is intended to illuminate design choices, we will use the QEMU RISCV vector emulator code to estimate the complexity of a native Ghidra RISCV vector and bit-manipulation capability.
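
As a rough feel for what a native implementation has to model, here is a simplified Python reference for a single element-wise vector add. It is not taken from QEMU. It assumes `vstart` is 0, uses undisturbed tail and mask policies only, and represents registers as plain lists. The function name and parameters are illustrative.

```python
def vadd_vv(vd, vs2, vs1, vl, sew=32, mask=None):
    """Reference model of an element-wise vector add.

    vd, vs2, vs1: lists of unsigned element values of equal length.
    vl: number of active elements; elements at index >= vl are the tail.
    sew: element width in bits; results wrap modulo 2**sew.
    mask: optional list of 0/1 flags, one per element.
    Assumes vstart == 0 and undisturbed tail/mask policies.
    """
    modulus = 1 << sew
    out = list(vd)  # start from the old destination so untouched elements survive
    for i in range(min(vl, len(vd))):
        if mask is not None and not mask[i]:
            continue  # masked-off element keeps its old value
        out[i] = (vs2[i] + vs1[i]) % modulus
    return out


old_vd = [9, 9, 9, 9]
a = [1, 2, 0xFFFFFFFF, 4]
b = [10, 20, 1, 40]
unmasked = vadd_vv(old_vd, a, b, vl=3)
masked = vadd_vv(old_vd, a, b, vl=3, mask=[1, 0, 1, 1])
```

Even this one instruction depends on state outside its operands (vector length, element width, mask, tail policy), which is the kind of complexity the estimate needs to account for.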

Optional extension

Add machine learning to the code being analyzed. In this case the network appliance has defined many possible layer 2 classifiers, and can dynamically select between them at run time to minimize power consumption, CPU burden, memory saturation, temperature, and latency. This is an extreme stretch goal, since the lookup structures are dynamic and need to evolve over many hours of heavy network load.