petrpavlu / valgrind-riscv64

Valgrind with support for the RISCV64/Linux platform.
GNU General Public License v2.0
56 stars 15 forks source link

support fence.i #21

Open mingyuan-xia opened 2 months ago

mingyuan-xia commented 2 months ago

19 partially

sorry for the inconsistent commit log, should be: riscv64: Support fence.i The project-level README is left unmodified.

mingyuan-xia commented 2 months ago

Note: fence.i is kind of equivalent to ARM's isb. ARM's toIR emits an Imbe_Fence node upon isb so I did the same for fence.i. ref: https://github.com/petrpavlu/valgrind-riscv64/blob/riscv64-linux/VEX/priv/guest_arm_toIR.c#L16012 Yet, I am not totally sure, as fence.i is stronger than Imbe_Fence's semantic specified in VEX IR.

Additionally, a quick search shows that the unprivileged fence.i is not common in userspace programs, even for a JIT like OpenJDK, see: https://github.com/openjdk/jdk/pull/9770

milkybird98 commented 2 months ago

In the ARM backend, the Imbe_Fence node corresponds to three barrier instructions, including an ISB instruction to ensure instruction synchronization, similar to what fence.i did:

https://github.com/petrpavlu/valgrind-riscv64/blob/8864c92efb8f65d8a56153f3bc8f7d2ac907b6cb/VEX/priv/host_arm64_defs.c#L4338-L4343

This means that the ARM backend has strengthened the semantics of the Imbe_Fence node.

However, in the RISC-V backend, the Imbe_Fence node corresponds to a single fence instruction for all types of device I/O and memory accesses.

https://github.com/petrpavlu/valgrind-riscv64/blob/d087725c0d68908bdebe57c528ac86d3dc11418b/VEX/priv/host_riscv64_defs.c#L2314-L2318

We might consider doing something similar, or perhaps introduce a new Imbe_Fence_I? node with correct semantics.

petrpavlu commented 2 months ago

Additionally, a quick search shows that the unprivileged fence.i is not common in userspace programs, even for a JIT like OpenJDK, see: openjdk/jdk#9770

Right, I wouldn't expect to see FENCE.I used in userspace applications. The RISC-V Instruction Set Manual Volume I: Unprivileged Architecture has a note on it:

Because FENCE.I only orders stores with a hart’s own instruction fetches, application code should only rely upon FENCE.I if the application thread will not be migrated to a different hart. The EEI can provide mechanisms for efficient multiprocessor instruction-stream synchronization.

In the ARM backend, the Imbe_Fence node corresponds to three barrier instructions, including an ISB instruction to ensure instruction synchronization, similar to what fence.i did: [...] We might consider doing something similar, or perhaps introduce a new Imbe_Fence_I? node with correct semantics.

I think that from the perspective of the VCPU that Valgrind implements, FENCE.I would effectively need to result in discarding all translations. Something as Ijk_InvalICache with guest_CMSTART and guest_CMLEN covering the whole address range.

mingyuan-xia commented 2 months ago

SMC (self-modifying code, somewhat seen in malware/injected code/thunk) is one more use case of fence.i, which changes the code near PC. But again, I am not convinced to support such cases with great efforts (changing backend/incurring great runtime overhead for other cases).

Based on the very rare use cases listed, I would recommend emitting an Imbe_Fence with a warning. Consider to run instead of crash on some random fence.i-s emitted by an old compiler/JIT, like the OpenJDK above-mentioned. After all, fence.i is a standardized instruction allowable in userspace. Would like to hear your opinion on this @petrpavlu.

petrpavlu commented 1 month ago

I'm somewhat in favor of implementing this instruction properly if it is added. Turning it into Ijk_InvalICache should be fairly simple, without a need for any backend changes.

Userspace programs shouldn't really need to use this instruction, but instead need to invoke the riscv_flush_icache() syscall. IMO the instruction is then not worth optimizing for and I wouldn't worry about the penalty incurred by Ijk_InvalICache.