Design a test strategy for pcode semantics

thixotropist commented 4 months ago

Many ISA extension instructions require custom pcodes in their processor's *.sinc files. The latest round of commits to Ghidra's isa_ext branch includes many corrections to simple errors. What kind of import testing would help reduce those errors?

This is complicated by the lack of reference models for ideal pcode generation. The current method is subjective - minimize the confusion a Ghidra user will experience when examining functions in the decompiler window.

The initial model for generating pcode is to generate pcode functions that look like the assembly instruction, with '_' replacing '.', and with function arguments in the same order as binutils presents them in a binary objdump. Many vector pcode instances currently have a variable number of arguments to allow for mask variants. This is not how riscv vector intrinsics handles masked operations.

This issue is intended to collect options for improved pcode examples to test against, along with estimates of the implementation complexity.

Options include:

Keep the existing pcode opcodes and let the Ghidra user become familiar with common autovectorization patterns. This likely minimizes Ghidra code churn.
Use standard compiler C intrinsics as examples of what we want to see in the decompiler window. Ghidra's pcode signatures should look much like these intrinsic functions.
Attempt to undo compiler autovectorization and loop unrolling, such that the decompiler window looks much like the original source code.

Option 3 is likely much too ambitious for now. Option 2 is ambitious and likely premature, but we can take small steps in that direction. Can we transform pcode ops into notional function calls, so that we can propagate parameter type and vector context information as if they were first-class function calls?

thixotropist commented 2 months ago

Picking the right goals to test pcode semantics against may be a bigger issue.

pcode should help the user better understand the decompilation results. It does not need to be precise.
pcode should enable the Ghidra emulator. It should be as precise as the qemu riscv emulator, possibly excluding exception handling.

For example, the riscv half-precision floating point extension zfa support in Ghidra would be a tough test case if full Ghidra emulation was required.

thixotropist commented 1 month ago

Let's start RISCV semantics testing with floating point conversions, since @jobermayr has done some nice cleanup there. Start by generating some unit tests exercised by a RISCV-64 gtest framework executing within a qemu-riscv-static emulator.

How close does Ghidra come to displaying something sensible in the decompiler window?
How does the Ghidra emulator compare to the qemu emulator?

For example, compile, RE, and execute under qemu the following:

float fcvt_s_w(int* i) {
    return (float)*i;
}
float fcvt_s_wu(unsigned int* i) {
    return (float)*i;
}
int fmv_x_w(float* x) {
    int val;
    float src = *x;
    __asm__ __volatile__ (
        "fmv.x.w  %0, %1" \
        : "=r" (val) \
        : "f" (src));
    return val;
}
...
int i = 1;
unsigned int iu = 1;
 unsigned int iumax = 0xffffffff;
EXPECT_NEAR(1.0, fcvt_s_w(&i), 1.0e-5) << "fcvt.s.w failure";
EXPECT_NEAR(1.0, fcvt_s_wu(&iu), 1.0e-5) << "fcvt.s.wu failure";
EXPECT_NEAR(4294967295.0, fcvt_s_wu(&iumax), 1.0) << "fcvt.s.wu failure on large unsigned ints";
...
float x = 1.0;
int x_as_int = 0x3f800000;
EXPECT_EQ(x_as_int, fmv_x_w(&x)) << "fmv.x.w failure";
EXPECT_NEAR(1.0, fmv_w_x(&x_as_int), 1.0e-5) << "fmv.w.x failure";

These unit tests all pass under qemu-riscv-static.

Ghidra's current (11.2-DEV) decompilation gives:

float fcvt_s_w(int *param_1)
{
  return (float)*param_1;                  // looks good
}

float fcvt_s_wu(uint *param_1)
{
  return (float)ZEXT416(*param_1);  // The ZEXT416 pcode op is incorrect here
}

long fmv_x_w(int param_1)              // param1 is the wrong type, and any signature fix will likely force
                                                              // an unwanted implicit cast
{
  return (long)param_1;
}

jobermayr commented 1 month ago

Let's start RISCV semantics testing with floating point conversions, since @jobermayr has done some nice cleanup there.

Sorry. I don't want to get all the credit. It was @Sleigh-InSPECtor. I just got a build error while trying to add https://github.com/NationalSecurityAgency/ghidra/pull/6492 to https://github.com/jobermayr/ghidra-staging ...

thixotropist commented 1 month ago

@jobermayr: Thanks for the clarification. What do you think the Ghidra emulator should do with the following code:

double xdNan = std::numeric_limits<double>::quiet_NaN();
fcvt_s_d(&xdNan);

Qemu properly tests for NaN inputs and throws an exception, which you can test and catch with

EXPECT_DEATH(fcvt_s_d(&xdNan), "") << "fcvt.s.d failure on NaN conversion";

I'm not saying Ghidra's emulator should be as faithful to the hardware spec - just that knowing where to draw the line would be nice. Divide by zero? Unaligned reads and writes?

thixotropist commented 3 weeks ago

A provisional design for pcode testing specific to ISA extensions:

create a library of function calls wrapping individual instructions
- import this library into Ghidra, examining each of those simple function calls for clarity in the decompiler window
add a test framework exercising each of those functions with various inputs
run the test framework under a known emulator, qemu-static or qemu-system, verifying that we are expecting correct results.
run the test framework under the Ghidra emulator, comparing the results with those of the qemu emulators.
estimate the level of effort needed to bring the Ghidra emulator up to qemu emulator behavior

The code under riscv64/generated/emulator_tests shows a sample first cut at this, taking on the 16 bit floating point RISCV64 extension. The qemu-static and qemu-system results look mostly good. Getting the same level of fidelity with Ghidra would take a massive amount of work, as it has no built-in support for 16 bit floats.

thixotropist / ghidra_import_tests

Design a test strategy for pcode semantics #17