Closed thixotropist closed 5 months ago
Let's start with the hardest long-term topic, then fill in with the easier material when we hit the inevitable brick wall.
The next version of GCC, gcc-14, apparently adds support for both RISCV vector intrinsic functions and for auto-vectorization of loops. That suggests we are likely to see vector extension functions appearing in many more places starting mid to late 2024. Ghidra's decompiler simply stops analysis of a function when it hits its first vector instruction, so the first goal involves finding vector instruction exemplars in the wild. The site https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/main/examples looks promising for this; `memcpy` and `strncpy` would be good immediate goals.

That sounds easy enough, but the RISCV vector intrinsic functions are built on all possible vector instructions multiplied by all possible vector configurations, apparently resulting in some 28,000 builtin function signatures. That's so many that there is no C header file holding them: a generated `riscv_vector.h` is compiled into an intermediate representation when the compiler is built, then injected directly into a running GCC-14 instance if and when `riscv_vector.h` is imported.
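For a concrete sense of what those intrinsics look like in source form, here is a minimal sketch of a vectorized byte copy in the style of the rvv-intrinsic-doc examples. The function name `copy_bytes` is invented for illustration, and a scalar fallback is added so the sketch also compiles where RVV intrinsics are unavailable:

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__riscv_v_intrinsic)
#include <riscv_vector.h>
#endif

/* Hedged sketch: a strip-mined memcpy-style loop using RVV intrinsics.
 * Each pass asks vsetvl how many elements fit, then does one vector
 * load and one vector store of that many bytes. */
void copy_bytes(uint8_t *dst, const uint8_t *src, size_t n) {
#if defined(__riscv_v_intrinsic)
    while (n > 0) {
        size_t vl = __riscv_vsetvl_e8m8(n);          /* elements this pass */
        vuint8m8_t v = __riscv_vle8_v_u8m8(src, vl); /* vector load        */
        __riscv_vse8_v_u8m8(dst, v, vl);             /* vector store       */
        src += vl; dst += vl; n -= vl;
    }
#else
    /* scalar fallback for non-RVV builds */
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
#endif
}
```

Each of those intrinsic calls maps to roughly one instruction, which is why a single source line fans out into thousands of builtin signatures across element widths and LMUL settings.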
A complete Ghidra solution might involve generating a large set of vector modes, tracking changes to those modes through logic branches, then picking the matching vector intrinsic function pcode operation for each vector instruction. That's far too complicated a solution for today.
Instead, let's simply look to add vector pcode semantics to RISCV sleigh files as we find vector instructions in exemplars. We won't try to completely capture vector state, just give a human user enough context to identify basic vector operations within larger programs.
This project will have a branching-tree structure, as we explore where instruction set extensions might appear in contexts relevant to Ghidra. There will be plenty of backtracking, so we need good living documents on where we find RISCV instruction set extensions and what Ghidra can do to help understand them.
The last few commits added a few riscv vector intrinsics, the kinds one might find inside a libc or matrix math library. Now we want to add a bit more specificity, looking for exemplars that a Ghidra user might run across.
For example: vectorized `memcpy` and `strcpy`. Will Ghidra still be helpful in identifying 'buffer blasting' vulnerabilities, or will analysis fail when it reaches the first vector load instruction?

Another set of exemplars depends on gcc-14 or later releases: riscv binaries compiled with autovectorization enabled.
The gcc testsuite files under `gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter` would make decent examples. Note that this requires compilation flags like `-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math`. These additional exemplars should probably get deferred until closer to the gcc-14 release, maybe mid 2024.
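For the `strcpy` question above, the source pattern of interest is the classic unbounded copy loop. A minimal sketch follows; the function name is hypothetical, and whether gcc-14 actually vectorizes it depends on flags and on its early-break vectorization, which leans on fault-only-first loads such as `vle8ff.v`:

```c
#include <stddef.h>

/* Hedged sketch: the unbounded copy loop behind "buffer blasting" bugs.
 * The loop has no bound on dst - that missing check is exactly what we
 * hope Ghidra can still surface after vectorization. */
char *copy_until_nul(char *dst, const char *src) {
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;   /* copy bytes until the NUL terminator, with no length limit */
    return dst;
}
```

The interesting test is whether the decompiler output for the vectorized form still makes the absence of a bounds check visible to a human reviewer.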
We still need to find riscv vector exemplars that make sense in network appliances. Vector implementations of AES-GCM would be useful. Vector algorithms for hash table and tree map lookups would be especially nice, as they could be used in sessionization of inbound IP and MPLS packets. Vector instructions might improve throughput of some operations by a factor of 2 or 3, but they won't automagically fix memory bandwidth limits inherent in a riscv system. Vector solutions will likely increase latency while enabling higher throughput, a tradeoff that depends on the application.
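To make the sessionization idea concrete, here is a hedged sketch of the scalar baseline such vector lookups would compete with: an FNV-1a hash over an IPv4 5-tuple. The `flow_key` and `flow_hash` names are invented for illustration; callers should zero the struct before filling it so padding bytes hash deterministically:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key for sessionizing inbound packets. */
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;
};

/* Scalar FNV-1a over the raw key bytes - the kind of per-packet hash
 * a vector or carry-less-multiply extension might accelerate. */
uint64_t flow_hash(const struct flow_key *k) {
    const uint8_t *p = (const uint8_t *)k;
    uint64_t h = 1469598103934665603ULL;      /* FNV-1a offset basis */
    for (size_t i = 0; i < sizeof *k; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;                /* FNV-1a 64-bit prime */
    }
    return h;
}
```

A vectorized hash table probe would batch several such keys per lookup, which is where the latency-for-throughput tradeoff noted above shows up.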
Commit 1da6a56c5 adds some of the THead vendor-specific ISA extensions described in https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.0.0/xthead-2022-09-05-2.0.0.pdf and implemented in binutils 2.41.
Consider adding 32 bit RISCV exemplars. This may include:
It would be especially nice to find an exemplar showing how 64 bit and 32 bit cores might communicate with one another.
What will Ghidra do with vectorized loops on processors with RISCV-64 vector extensions? We can find out by adding exemplars drawn from the unreleased GCC-14 RISCV-64 vectorization test suite. For that we need a toolchain and platform based on what we expect to see in mid 2024, when GCC-14 should be released. This platform will mutate quickly, based on the development tip of GCC, binutils, and glibc - all cast into a Bazel 7.0 build environment.
Add some rust binaries, compiled with both llvm and gcc, and ideally for both x86_64 and riscv-64 processors. An initial exercise might involve matching rust strings with logging and assertion calls, for instance:

```rust
assert_eq!(
    expected_count,
    2,
    "Resource counter should be two at this point"
);
log::info!("Done with primitives");
log::info!(
    "bad_result.is_err(): {:?}",
    bad_result.is_err()
);
```
The gccrs rust compiler can't handle macros from std yet, so defer rust exemplars.
The gcc-14 developmental toolchain is in place. We probably want to add simple C exemplars in groups to show:
The path forward for Ghidra imports should track the path taken by GCC toolchains. There isn't much point in getting too far ahead of what the compiler typically does. This suggests using `-O2` instead of something more detailed, and using `-march=x86-64-v3` instead of `-march=sapphirerapids` if we need an Intel reference.

`memcpy` is much more likely to be inlined than `strncmp`, so we should be able to recognize the 6 or so different instruction sequences generated by gcc when it expands its `cpymem` RTL instruction. So the next set of exemplars will be built around GCC testsuite code validating `memcpy`-like operations, showing the two types of C source code most likely to be translated into `cpymem` RTL instructions and the 6 or so instruction patterns generated with the various RISCV `march` common options. The result should be a human-readable document helping Ghidra users recognize inlined `cpymem` expansions.
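As a sketch of the source patterns involved, here is our guess at two C forms gcc commonly lowers to its `cpymem` RTL pattern and then inlines rather than calling out to libc; the struct and function names are hypothetical:

```c
#include <string.h>

/* Hypothetical fixed-layout struct, 64 bytes with no padding. */
struct packet {
    unsigned char hdr[16];
    unsigned char payload[48];
};

/* (1) memcpy with a compile-time-constant length. */
void copy_fixed(unsigned char *dst, const unsigned char *src) {
    memcpy(dst, src, 64);   /* length known at compile time, so gcc
                               typically expands it inline */
}

/* (2) whole-struct assignment, which gcc expands the same way. */
void copy_struct(struct packet *dst, const struct packet *src) {
    *dst = *src;            /* no memcpy call appears in the source */
}
```

In the disassembly, neither function need contain a `memcpy` call symbol, which is precisely why a reference document on the inlined expansion patterns would help Ghidra users.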
The next set of exemplars covers short term, medium term, and longer term issues. We want to understand how soon extension-based optimizations will be a problem for Ghidra, and how much time it might take to work up semantic hints that may address those problems.
`vset` instructions followed by vector load and store instructions, possibly within a short loop. These vector instructions show more diversity than one might expect, possibly to replicate scalar alignment exception handling. Ghidra users likely don't need to care about that diversity.

If possible, we should develop each of these binary exemplars for vanilla gcc-14 `-O2` optimization, gcc-14 vector + bit manipulation optimization extensions, and gcc-14 vector + bit manipulation + THead extensions.
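For readers unfamiliar with that `vset` loop shape, a scalar C rendition of the strip-mined structure may help; `CHUNK` is an invented stand-in for the hardware-selected element count a `vsetvli` instruction would return each iteration:

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK 16  /* stand-in for the hardware vector length */

/* Hedged sketch: the loop structure behind a vset + vector load/store
 * sequence. Each pass handles up to CHUNK bytes; the inner loop plays
 * the role of one vector load plus one vector store. */
void strip_mined_copy(uint8_t *dst, const uint8_t *src, size_t n) {
    while (n > 0) {
        size_t vl = n < CHUNK ? n : CHUNK;  /* what vsetvli would return */
        for (size_t i = 0; i < vl; i++)     /* one vle8.v / vse8.v pair  */
            dst[i] = src[i];
        dst += vl; src += vl; n -= vl;
    }
}
```

The tail iteration needs no special-case code because the final `vl` simply shrinks, which is one reason the emitted vector loops look more uniform than their scalar unrolled counterparts.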
Close this as we have enough exemplars to go forward. Tests show some minor gaps in the Ghidra 11 isa_ext branch.
Determine the next set of exemplars - and design questions - to tackle. The meta-issue involves guessing where new Ghidra capabilities in analyzing RISCV-64 evolution will make a difference, say in 2026 when higher performance RISCV-64 cores appear in network appliances. We'll continue to prioritize network over machine learning and general server extensions, so things like half-precision floating point will get less attention than topics like fast hashing or advanced management of many-core access to IO memory and cache hierarchies. Exemplars that resemble a RISCV-64 alternative to Xtensa or Cavium Network Processing Unit designs are good, if we can find any.
Possible topics to explore next include:

- `libssl`
- `libc`, for example implementing faster versions of `memmove` and `strncpy`.

Possible priorities include: