Open Antwy opened 2 months ago
Awesome. Thanks for such an MR. Give me a few weeks to review this. Can you try to fix the CI?
@cnheitman can you take a look at this too, so that we have at least two reviews for such an MR?
Awesome! Great PR!
@JonathanSalwan Yes, I'll review it, most probably sometime next week.
Well, I guess one way to fix the CI is to update the Capstone version from 4.0.2 to 5.0+. But if you need the older version, I can try to put the RISC-V code under defines.
The PR looks good, great work @Antwy o/
This PR was based on master (I think) and does not include the commit upgrading Bitwuzla from v0.2.0 to v0.4.0 of dev-v1.0. However, there should be no conflicts rebasing on top of that change.
I did not dive much into the semantics file. On a quick overview, the semantics look good. If we want to increase our confidence in the code, we can add some more tests (for instance, we have a binary with different optimization levels which we use to test the ARM semantics). However, basic unittests were included, so it seems fine for now.
Ideally, I would add a RV32 and RV64 version of this custom crackme so we can have an example of a full working binary, as we do with AARCH64 and ARM.
Regarding the CI and Capstone: I think we can move on to version 5.0.1. Version 4.0.2 is from 2020 (iirc) and, as far as I can tell, we don't have any specific reason to keep supporting it.
Now rebased on dev-v1.0, with some semantics issues fixed (though I'm still not sure about the taint spreading variants).
Adding a binary test seems reasonable, as the basic unittest suite doesn't cover control-flow transfers.
Added the crackme binary test
Might be worthwhile to take a look at the official RISC-V ISA tests: https://github.com/riscv-software-src/riscv-tests
I wrote a small python script to process the compiled files and generate a data.h that contains enough information to start emulation: https://github.com/thesecretclub/riscy-business/blob/master/riscvm/generate-isa-tests.py. You just have to modify this chunk of code to set up the triton context and start emulation:
https://github.com/thesecretclub/riscy-business/blob/master/riscvm/tests.cpp#L44-L55
The reason for the generate-isa-tests.py is that the test executables have some common setup, which requires support for supervisor instructions/paging, so I extract just the raw RV64I code and run that directly (since the setup can be done on the emulator host side).
The process should be similar for the 32-bit ISA tests, but I didn't compile or try them. It helped me a lot to find bugs in my emulator, so it was definitely worth the effort in my opinion! You also do not need unicorn anymore (because given their track record their semantics are probably incorrect), or at least it can be an additional test.
Hi! I've played around with this PR and seem to have found a bug. According to the RISC-V ISA manual, the SLL instruction in RV64I should perform a logical shift left by the shift amount held in the lower 6 bits of the third operand (this also applies to other similar instructions).
In RV64I, only the low 6 bits of rs2 are considered for the shift amount.
However, in the current implementation only the lower 5 bits are used.
This example demonstrates the problem:
from unicorn.riscv_const import *
from unicorn import *
from triton import *

CODE_START = 0x0

def emu_unicorn():
    # sll s8, s7, a7
    opcode = b"\x33\x9c\x1b\x01"
    mu = Uc(UC_ARCH_RISCV, UC_MODE_RISCV64)
    mu.mem_map(CODE_START, 0x1000)
    mu.mem_write(CODE_START, opcode)
    mu.reg_write(UC_RISCV_REG_A7, 0x69C99AB9B9401024)
    mu.reg_write(UC_RISCV_REG_S7, 0xDB4D6868655C3585)
    mu.reg_write(UC_RISCV_REG_PC, CODE_START)
    mu.emu_start(CODE_START, CODE_START + len(opcode))
    print("(UC) s8 = 0x%x" % mu.reg_read(UC_RISCV_REG_S8))

def emu_triton():
    # sll s8, s7, a7
    opcode = b"\x33\x9c\x1b\x01"
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.RV64)
    inst = Instruction()
    inst.setOpcode(opcode)
    inst.setAddress(CODE_START)
    ctx.setConcreteRegisterValue(ctx.registers.x17, 0x69C99AB9B9401024)
    ctx.setConcreteRegisterValue(ctx.registers.x23, 0xDB4D6868655C3585)
    ctx.processing(inst)
    print("(TT) s8 = 0x%x" % ctx.getConcreteRegisterValue(ctx.registers.x24))

def main():
    emu_unicorn()
    emu_triton()

if __name__ == "__main__":
    main()
Good catch, @m4drat! Thanks!
Modified my generation script to generate a python file that can be used for the tests:
from elftools.elf.elffile import ELFFile
from elftools.elf.sections import SymbolTableSection
import os
import zlib

def parse_test_elf(file):
    with open(file, "rb") as f:
        elf = ELFFile(f)
        # Enumerate the SymbolTableSection
        for section in elf.iter_sections():
            if isinstance(section, SymbolTableSection):
                for i in range(section.num_symbols()):
                    symbol = section.get_symbol(i)
                    if symbol.name:
                        if symbol.name.startswith("test_"):
                            address = symbol.entry.st_value
                            # Convert address to file offset
                            offset = list(elf.address_offsets(address))[0]
                            return address, offset
    return None, None

def main():
    tests = []
    directory = "isa-tests"
    code = "import zlib\n\n"
    for file in sorted(os.listdir(directory)):
        if file.startswith("rv64") and not file.endswith(".dump"):
            path = os.path.join(directory, file)
            address, offset = parse_test_elf(path)
            if offset is None:
                print(f"Failed to parse {file}")
                continue
            data = f"__{file.replace('-', '_').upper()}_DATA = zlib.decompress(b\""
            with open(path, "rb") as f:
                for byte in zlib.compress(f.read(), 9):
                    data += f"\\x{byte:02x}"
            data += "\")\n"
            code += data
            tests.append((file, address, offset))
    code += "\n"
    code += "TESTS = [\n"
    for name, address, offset in tests:
        variable = f"__{name.replace('-', '_').upper()}_DATA"
        code += f" (\"{name}\", {variable}, {hex(address)}, {hex(offset)}),\n"
    code += "]\n"
    with open("isa-tests/data.py", "wb") as f:
        f.write(code.encode("utf-8"))

if __name__ == "__main__":
    main()
The generated data.py: data.py.zip (384 kb)
@Antwy
Hm, I might be wrong on this one, but I couldn't see in the code whether you handle RV32/RV64 cases differently. Because the shift amount should depend on the target. 6 bits - RV64, 5 bits - RV32.
To be precise, here are the quotes (the first from the RV64I chapter, the second from the RV32I chapter):
SLL, SRL, and SRA perform logical left, logical right, and arithmetic right shifts on the value in register rs1 by the shift amount held in register rs2. In RV64I, only the low 6 bits of rs2 are considered for the shift amount.
SLL, SRL, and SRA perform logical left, logical right, and arithmetic right shifts on the value in register rs1 by the shift amount held in the lower 5 bits of register rs2.
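To make the difference between the two quotes concrete, here is a rough plain-Python sketch of how the shift amount could be selected per target. It does not use Triton; the helper name `sll` and the `xlen` parameter are made up for this illustration.

```python
def sll(rs1: int, rs2: int, xlen: int) -> int:
    """Model a register-register SLL for a 32-bit or 64-bit target."""
    assert xlen in (32, 64)
    shamt_mask = xlen - 1              # 0x1f (5 bits) on RV32, 0x3f (6 bits) on RV64
    shamt = rs2 & shamt_mask
    return (rs1 << shamt) & ((1 << xlen) - 1)

# Operand values taken from the sll s8, s7, a7 reproducer in this thread.
rs1 = 0xDB4D6868655C3585
rs2 = 0x69C99AB9B9401024               # low 6 bits are 0x24, low 5 bits are 0x04

print(hex(sll(rs1, rs2, 64)))                # RV64: shifts by 36
print(hex(sll(rs1 & 0xFFFFFFFF, rs2, 32)))   # RV32: shifts by 4
```

The same masking would apply to SRL and SRA; only the shift operation itself changes.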
Spent some time rigging up the official ISA tests on top of this PR:
import sys
from struct import pack
from triton import *
from riscv64_data import TESTS as RV64TESTS
def emulate_test(name: str, binary: bytes, address: int, offset: int, trace: bool):
# initial state
STACK = 0x200000
istate = {
"stack": bytearray(b"".join([pack('B', 255 - i) for i in range(256)])),
"heap": bytearray(b"".join([pack('B', i) for i in range(256)])),
"x0": 0x0,
"x1": 0x0,
"x2": STACK,
"x3": 0x0,
"x4": 0x0,
"x5": 0x0,
"x6": 0x0,
"x7": 0x0,
"x8": 0x0,
"x9": 0x0,
"x10": 0x0,
"x11": 0x0,
"x12": 0x0,
"x13": 0x0,
"x14": 0x0,
"x15": 0x0,
"x16": 0x0,
"x17": 0x0,
"x18": 0x0,
"x19": 0x0,
"x20": 0x0,
"x21": 0x0,
"x22": 0x0,
"x23": 0x0,
"x24": 0x0,
"x25": 0x0,
"x26": 0x0,
"x27": 0x0,
"x28": 0x0,
"x29": 0x0,
"x30": 0x0,
"x31": 0x0,
"f0": 0x00112233445566778899aabbccddeeff,
"f1": 0xffeeddccbbaa99887766554433221100,
"f2": 0xfefedcdc5656787889892692dfeccaa0,
"f3": 0x1234567890987654321bcdffccddee01,
"f4": 0x0,
"f5": 0x0,
"f6": 0x0,
"f7": 0x0,
"f8": 0x0,
"f9": 0x0,
"f10": 0x0,
"f11": 0x0,
"f12": 0x0,
"f13": 0x0,
"f14": 0x0,
"f15": 0x0,
"f16": 0x0,
"f17": 0x0,
"f18": 0x0,
"f19": 0x0,
"f20": 0x0,
"f21": 0x0,
"f22": 0x0,
"f23": 0x0,
"f24": 0x0,
"f25": 0x0,
"f26": 0x0,
"f27": 0x0,
"f28": 0x0,
"f29": 0x0,
"f30": 0x0,
"f31": 0x0,
"pc": address,
}
ctx = TritonContext()
ctx.setArchitecture(ARCH.RV64)
ctx.setConcreteMemoryAreaValue(STACK, bytes(istate['stack']))
ctx.setConcreteMemoryAreaValue(address, binary[offset:])
ctx.setConcreteRegisterValue(ctx.registers.x0, 0)
ctx.setConcreteRegisterValue(ctx.registers.x1, istate['x1'])
ctx.setConcreteRegisterValue(ctx.registers.x2, istate['x2'])
ctx.setConcreteRegisterValue(ctx.registers.x3, istate['x3'])
ctx.setConcreteRegisterValue(ctx.registers.x4, istate['x4'])
ctx.setConcreteRegisterValue(ctx.registers.x5, istate['x5'])
ctx.setConcreteRegisterValue(ctx.registers.x6, istate['x6'])
ctx.setConcreteRegisterValue(ctx.registers.x7, istate['x7'])
ctx.setConcreteRegisterValue(ctx.registers.x8, istate['x8'])
ctx.setConcreteRegisterValue(ctx.registers.x9, istate['x9'])
ctx.setConcreteRegisterValue(ctx.registers.x10, istate['x10'])
ctx.setConcreteRegisterValue(ctx.registers.x11, istate['x11'])
ctx.setConcreteRegisterValue(ctx.registers.x12, istate['x12'])
ctx.setConcreteRegisterValue(ctx.registers.x13, istate['x13'])
ctx.setConcreteRegisterValue(ctx.registers.x14, istate['x14'])
ctx.setConcreteRegisterValue(ctx.registers.x15, istate['x15'])
ctx.setConcreteRegisterValue(ctx.registers.x16, istate['x16'])
ctx.setConcreteRegisterValue(ctx.registers.x17, istate['x17'])
ctx.setConcreteRegisterValue(ctx.registers.x18, istate['x18'])
ctx.setConcreteRegisterValue(ctx.registers.x19, istate['x19'])
ctx.setConcreteRegisterValue(ctx.registers.x20, istate['x20'])
ctx.setConcreteRegisterValue(ctx.registers.x21, istate['x21'])
ctx.setConcreteRegisterValue(ctx.registers.x22, istate['x22'])
ctx.setConcreteRegisterValue(ctx.registers.x23, istate['x23'])
ctx.setConcreteRegisterValue(ctx.registers.x24, istate['x24'])
ctx.setConcreteRegisterValue(ctx.registers.x25, istate['x25'])
ctx.setConcreteRegisterValue(ctx.registers.x26, istate['x26'])
ctx.setConcreteRegisterValue(ctx.registers.x27, istate['x27'])
ctx.setConcreteRegisterValue(ctx.registers.x28, istate['x28'])
ctx.setConcreteRegisterValue(ctx.registers.x29, istate['x29'])
ctx.setConcreteRegisterValue(ctx.registers.x30, istate['x30'])
ctx.setConcreteRegisterValue(ctx.registers.x31, istate['x31'])
ctx.setConcreteRegisterValue(ctx.registers.f0, istate['f0'])
ctx.setConcreteRegisterValue(ctx.registers.f1, istate['f1'])
ctx.setConcreteRegisterValue(ctx.registers.f2, istate['f2'])
ctx.setConcreteRegisterValue(ctx.registers.f3, istate['f3'])
ctx.setConcreteRegisterValue(ctx.registers.f4, istate['f4'])
ctx.setConcreteRegisterValue(ctx.registers.f5, istate['f5'])
ctx.setConcreteRegisterValue(ctx.registers.f6, istate['f6'])
ctx.setConcreteRegisterValue(ctx.registers.f7, istate['f7'])
ctx.setConcreteRegisterValue(ctx.registers.f8, istate['f8'])
ctx.setConcreteRegisterValue(ctx.registers.f9, istate['f9'])
ctx.setConcreteRegisterValue(ctx.registers.f10, istate['f10'])
ctx.setConcreteRegisterValue(ctx.registers.f11, istate['f11'])
ctx.setConcreteRegisterValue(ctx.registers.f12, istate['f12'])
ctx.setConcreteRegisterValue(ctx.registers.f13, istate['f13'])
ctx.setConcreteRegisterValue(ctx.registers.f14, istate['f14'])
ctx.setConcreteRegisterValue(ctx.registers.f15, istate['f15'])
ctx.setConcreteRegisterValue(ctx.registers.f16, istate['f16'])
ctx.setConcreteRegisterValue(ctx.registers.f17, istate['f17'])
ctx.setConcreteRegisterValue(ctx.registers.f18, istate['f18'])
ctx.setConcreteRegisterValue(ctx.registers.f19, istate['f19'])
ctx.setConcreteRegisterValue(ctx.registers.f20, istate['f20'])
ctx.setConcreteRegisterValue(ctx.registers.f21, istate['f21'])
ctx.setConcreteRegisterValue(ctx.registers.f22, istate['f22'])
ctx.setConcreteRegisterValue(ctx.registers.f23, istate['f23'])
ctx.setConcreteRegisterValue(ctx.registers.f24, istate['f24'])
ctx.setConcreteRegisterValue(ctx.registers.f25, istate['f25'])
ctx.setConcreteRegisterValue(ctx.registers.f26, istate['f26'])
ctx.setConcreteRegisterValue(ctx.registers.f27, istate['f27'])
ctx.setConcreteRegisterValue(ctx.registers.f28, istate['f28'])
ctx.setConcreteRegisterValue(ctx.registers.f29, istate['f29'])
ctx.setConcreteRegisterValue(ctx.registers.f30, istate['f30'])
ctx.setConcreteRegisterValue(ctx.registers.f31, istate['f31'])
pc = istate['pc']
for i in range(1000):
ctx.setConcreteRegisterValue(ctx.registers.pc, pc)
opcode = ctx.getConcreteMemoryValue(MemoryAccess(pc, CPUSIZE.DWORD))
opcode_bytes = pack('<I', opcode)
inst = Instruction(opcode_bytes)
inst.setAddress(pc)
state = ctx.processing(inst)
if trace:
print(inst)
if state == EXCEPTION.NO_FAULT:
pc = ctx.getConcreteRegisterValue(ctx.registers.pc)
else:
disasm = inst.getDisassembly()
if "fence" in disasm:
# HACK: ignore the unsupported fence instruction
pc += 4
elif "ecall" in disasm:
syscall_index = ctx.getConcreteRegisterValue(ctx.registers.x17)
#assert syscall_index == 139, f"invalid syscall: {syscall_index}"
return ctx.getConcreteRegisterValue(ctx.registers.x10)
else:
raise Exception(f"{inst} -> exception {state}")
return -1
if __name__ == "__main__":
success = 0
for name, binary, address, offset in RV64TESTS:
exit_code = emulate_test(name, binary, address, offset, trace=False)
if exit_code == 0:
print(f"SUCCESS: {name}")
success += 1
else:
print(f"FAILURE: {name}, {exit_code}")
print(f"\n{success}/{len(RV64TESTS)} passed")
Unfortunately the success rate isn't the best, here is the output:
FAILURE: rv64ui-p-add, 46
FAILURE: rv64ui-p-addi, 83
FAILURE: rv64ui-p-addiw, 83
FAILURE: rv64ui-p-addw, 46
SUCCESS: rv64ui-p-and
FAILURE: rv64ui-p-andi, 15
SUCCESS: rv64ui-p-auipc
SUCCESS: rv64ui-p-beq
SUCCESS: rv64ui-p-bge
SUCCESS: rv64ui-p-bgeu
SUCCESS: rv64ui-p-blt
SUCCESS: rv64ui-p-bltu
SUCCESS: rv64ui-p-bne
SUCCESS: rv64ui-p-fence_i
SUCCESS: rv64ui-p-jal
FAILURE: rv64ui-p-jalr, 15
SUCCESS: rv64ui-p-lb
SUCCESS: rv64ui-p-lbu
SUCCESS: rv64ui-p-ld
SUCCESS: rv64ui-p-lh
SUCCESS: rv64ui-p-lhu
FAILURE: rv64ui-p-lui, 18446744071562067968
SUCCESS: rv64ui-p-lw
SUCCESS: rv64ui-p-lwu
FAILURE: rv64ui-p-ma_data, -1
FAILURE: rv64ui-p-or, 858993459
FAILURE: rv64ui-p-ori, 16713727
SUCCESS: rv64ui-p-sb
SUCCESS: rv64ui-p-sd
SUCCESS: rv64ui-p-sh
SUCCESS: rv64ui-p-simple
FAILURE: rv64ui-p-sll, 1024
FAILURE: rv64ui-p-slli, 34603008
FAILURE: rv64ui-p-slliw, 13
FAILURE: rv64ui-p-sllw, 13
FAILURE: rv64ui-p-slt, 1
SUCCESS: rv64ui-p-slti
FAILURE: rv64ui-p-sltiu, 1
FAILURE: rv64ui-p-sltu, 1
FAILURE: rv64ui-p-sra, 1024
SUCCESS: rv64ui-p-srai
SUCCESS: rv64ui-p-sraiw
FAILURE: rv64ui-p-sraw, 1024
FAILURE: rv64ui-p-srl, 1024
SUCCESS: rv64ui-p-srli
FAILURE: rv64ui-p-srliw, 7
FAILURE: rv64ui-p-srlw, 7
FAILURE: rv64ui-p-sub, 18446744073709551602
FAILURE: rv64ui-p-subw, 18446744073709551602
SUCCESS: rv64ui-p-sw
FAILURE: rv64ui-p-xor, 858993459
FAILURE: rv64ui-p-xori, 16713712
FAILURE: rv64um-p-div, 17
FAILURE: rv64um-p-divu, 17
FAILURE: rv64um-p-divuw, 17
FAILURE: rv64um-p-divw, 17
FAILURE: rv64um-p-mul, 1122
FAILURE: rv64um-p-mulh, 1122
FAILURE: rv64um-p-mulhsu, 1122
FAILURE: rv64um-p-mulhu, 1122
FAILURE: rv64um-p-mulw, 1122
FAILURE: rv64um-p-rem, 17
FAILURE: rv64um-p-remu, 17
FAILURE: rv64um-p-remuw, 17
FAILURE: rv64um-p-remw, 17
26/65 passed
I might have set something up incorrectly, though. The register names do not match the disassembly, so it's difficult to debug/trace without creating an additional mapping etc.
@mrexodia Well, after adding debug printing and the lines from rv64ui-lui (which has status 'FAILURE') to the current testsuite, I got:
Instruction: 0x10011a: lui ra, 0
x0: 0x0
x1: 0x0
----------------
[OK] lui x1, #0x00000
-------------------------------
Instruction: 0x10011e: lui ra, 0xfffff
x0: 0x0
x1: 0xfffffffffffff000
----------------
[OK] lui x1, #0xfffff
-------------------------------
Instruction: 0x100122: srai ra, ra, 1
x0: 0x0
x1: 0xfffffffffffff800
----------------
[OK] srai x1, x1, #1
-------------------------------
Instruction: 0x100126: lui ra, 0x7ffff
x0: 0x0
x1: 0x7ffff000
----------------
[OK] lui x1, #0x7ffff
-------------------------------
Instruction: 0x10012a: srai ra, ra, 0x14
x0: 0x0
x1: 0x7ff
----------------
[OK] srai x1, x1, #20
-------------------------------
Instruction: 0x10012e: lui ra, 0x80000
x0: 0x0
x1: 0xffffffff80000000
----------------
[OK] lui x1, #0x80000
-------------------------------
Instruction: 0x100132: srai ra, ra, 0x14
x0: 0x0
x1: 0xfffffffffffff800
----------------
[OK] srai x1, x1, #20
-------------------------------
Instruction: 0x100136: lui zero, 0
x0: 0x0
x1: 0xfffffffffffff800
----------------
[OK] lui x0, #0x80000
These seem equal to the test's expected values. Maybe the result above is caused by parsing issues when a testcase contains more than one instruction, or when an instruction with an immediate operand is written without the "i".
The test lines in src/testers/riscv/unicorn_test_riscv64.py:
(b"\xb7\x00\x00\x00", "lui x1, #0x00000"),
(b"\xb7\xf0\xff\xff", "lui x1, #0xfffff"),
(b"\x93\xd0\x10\x40", "srai x1, x1, #1"),
(b"\xb7\xf0\xff\x7f", "lui x1, #0x7ffff"),
(b"\x93\xd0\x40\x41", "srai x1, x1, #20"),
(b"\xb7\x00\x00\x80", "lui x1, #0x80000"),
(b"\x93\xd0\x40\x41", "srai x1, x1, #20"),
(b"\x37\x00\x00\x00", "lui x0, #0x80000"),
and the debug printing right after ctx.processing(inst):
print("Instruction: ", inst)
print("x0: ", hex(ctx.getSymbolicRegisterValue(ctx.registers.x0)))
print("x1: ", hex(ctx.getSymbolicRegisterValue(ctx.registers.x1)))
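For checking the x1 values in this trace by hand, a small Python model of LUI and SRAI on RV64 may help. This is only a sketch; `sext`, `lui` and `srai` are illustrative helpers, not Triton functions.

```python
MASK64 = (1 << 64) - 1

def sext(value: int, bits: int) -> int:
    """Sign-extend a `bits`-wide value to 64 bits (returned as unsigned)."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & MASK64

def lui(imm20: int) -> int:
    # The 20-bit immediate lands in bits 31:12, then the 32-bit value is sign-extended.
    return sext((imm20 & 0xFFFFF) << 12, 32)

def srai(rs1: int, shamt: int) -> int:
    # Reinterpret as signed; Python's >> on negative ints is an arithmetic shift.
    signed = rs1 - (1 << 64) if rs1 >> 63 else rs1
    return (signed >> (shamt & 0x3F)) & MASK64

for imm20, shamt in [(0xFFFFF, 1), (0x7FFFF, 20), (0x80000, 20)]:
    x1 = lui(imm20)
    print(f"lui {hex(imm20)} -> {hex(x1)}, srai {shamt} -> {hex(srai(x1, shamt))}")
```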
@Antwy
Stumbled upon another corner-case for the REMW instruction:
from unicorn.riscv_const import *
from unicorn import *
from triton import *

CODE_START = 0x0

def emu_unicorn():
    # remw s0, s5, t0
    opcode = b"\x3b\xe4\x5a\x02"
    mu = Uc(UC_ARCH_RISCV, UC_MODE_RISCV64)
    mu.mem_map(CODE_START, 0x1000)
    mu.mem_write(CODE_START, opcode)
    mu.reg_write(UC_RISCV_REG_S5, 0x917665C427EBEE5D)
    mu.reg_write(UC_RISCV_REG_T0, 0x0000000000000000)
    mu.reg_write(UC_RISCV_REG_PC, CODE_START)
    mu.emu_start(CODE_START, CODE_START + len(opcode))
    print("(UC) s0 = 0x%x" % mu.reg_read(UC_RISCV_REG_S0))

def emu_triton():
    # remw s0, s5, t0
    opcode = b"\x3b\xe4\x5a\x02"
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.RV64)
    inst = Instruction()
    inst.setOpcode(opcode)
    inst.setAddress(CODE_START)
    ctx.setConcreteRegisterValue(ctx.registers.x21, 0x917665C427EBEE5D)
    ctx.setConcreteRegisterValue(ctx.registers.x5, 0x0000000000000000)
    ctx.processing(inst)
    print("(TT) s0 = 0x%x" % ctx.getConcreteRegisterValue(ctx.registers.x8))

def main():
    emu_unicorn()
    emu_triton()

if __name__ == "__main__":
    main()
According to the ISA manual, in case of division by 0, the result of the operation should be equal to the lowest 32 bits of the dividend, not 0.
The semantics for division by zero and division overflow are summarized in Table 11. The quotient of division by zero has all bits set, and the remainder of division by zero equals the dividend. Signed division overflow occurs only when the most-negative integer is divided by -1. The quotient of a signed division with overflow is equal to the dividend, and the remainder is zero. Unsigned division overflow cannot occur.
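The quoted rules can be written down as a small reference model. The following plain-Python sketch covers DIVW and REMW only, under the assumption that the 32-bit result is sign-extended to 64 bits as for the other *W instructions; `divw`, `remw` and `to_signed32` are hypothetical helper names, not part of Triton.

```python
MASK64 = (1 << 64) - 1
INT32_MIN = -(1 << 31)

def to_signed32(value: int) -> int:
    """Interpret the low 32 bits of a register as a signed integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value

def divw(rs1: int, rs2: int) -> int:
    a, b = to_signed32(rs1), to_signed32(rs2)
    if b == 0:
        return MASK64                      # quotient has all bits set
    if a == INT32_MIN and b == -1:
        return a & MASK64                  # overflow: quotient equals the dividend
    q = abs(a) // abs(b)                   # RISC-V division rounds toward zero
    return (-q if (a < 0) != (b < 0) else q) & MASK64

def remw(rs1: int, rs2: int) -> int:
    a, b = to_signed32(rs1), to_signed32(rs2)
    if b == 0:
        return a & MASK64                  # remainder equals the dividend
    if a == INT32_MIN and b == -1:
        return 0                           # overflow: remainder is zero
    r = abs(a) % abs(b)                    # remainder takes the sign of the dividend
    return (-r if a < 0 else r) & MASK64

# Operands from the remw s0, s5, t0 reproducer in this thread.
print(hex(remw(0x917665C427EBEE5D, 0)))
```

Note that Python's own `//` and `%` round toward negative infinity, which is why the sketch works on absolute values and restores the sign afterwards.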
For my emulator I ran into bugs with the immediate loading. So the operation was correct, but certain encodings related to (shifted) immediates were not (especially the sign extension is very complicated). That might also be the case here…
Here are the traces from my emulator, might be helpful: rv64ui-traces.zip
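As an illustration of the kind of immediate handling that tends to go wrong, here is a hedged Python sketch of decoding the I-type immediate and the RV64 shift amount from a raw instruction word. The helper names are invented for this example, and the `addi` encoding was assembled by hand for it.

```python
import struct

def i_type_imm(word: int) -> int:
    """Return the sign-extended 12-bit immediate stored in bits 31:20."""
    imm = (word >> 20) & 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm

def shamt_rv64(word: int) -> int:
    """Return the 6-bit shift amount (bits 25:20) of an RV64 shift-immediate."""
    return (word >> 20) & 0x3F

# srai x1, x1, 20: the bytes come from the unicorn test lines in this thread.
srai_word = struct.unpack("<I", b"\x93\xd0\x40\x41")[0]
# addi x1, x0, -1: hand-assembled for this example.
addi_word = 0xFFF00093

print(shamt_rv64(srai_word))   # shift amount only; bit 30 selects the arithmetic shift
print(i_type_imm(addi_word))   # negative immediates must be sign-extended
```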
@m4drat Thanks! Guess this one is fixed too. Please, let me know if you find anything else!
Can you fix vcpkg by adding the risc feature and updating Capstone version for Appveyor? Once all CIs are green, I will do a quick review and merge it to dev-v1.0
:)
https://vcpkg.link/ports/capstone/v/5.0.1/1
I think we also have to update capstone in vcpkg to switch from 5.0.0-rc2 to 5.0.1
@Antwy
Found a problem with the SLLIW instruction:
from unicorn.riscv_const import *
from unicorn import *
from triton import *

CODE_START = 0x0

def emu_unicorn():
    # slliw t0, s4, 0xc
    opcode = b"\x9b\x12\xca\x00"
    mu = Uc(UC_ARCH_RISCV, UC_MODE_RISCV64)
    mu.mem_map(CODE_START, 0x1000)
    mu.mem_write(CODE_START, opcode)
    mu.reg_write(UC_RISCV_REG_S4, 0x10ab95)
    mu.reg_write(UC_RISCV_REG_T0, 0x000000)
    mu.reg_write(UC_RISCV_REG_PC, CODE_START)
    mu.emu_start(CODE_START, CODE_START + len(opcode))
    print("(UC) t0 = 0x%x" % mu.reg_read(UC_RISCV_REG_T0))

def emu_triton():
    # slliw t0, s4, 0xc
    opcode = b"\x9b\x12\xca\x00"
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.RV64)
    inst = Instruction()
    inst.setOpcode(opcode)
    inst.setAddress(CODE_START)
    ctx.setConcreteRegisterValue(ctx.registers.x20, 0x10ab95)
    ctx.setConcreteRegisterValue(ctx.registers.x5, 0x000000)
    ctx.processing(inst)
    print("(TT) t0 = 0x%x" % ctx.getConcreteRegisterValue(ctx.registers.x5))

def main():
    emu_unicorn()
    emu_triton()

if __name__ == "__main__":
    main()
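For reference, this is how I would expect SLLIW to behave, written as a minimal Python model: the shift is done on 32 bits and the 32-bit result is sign-extended to 64 bits. The helper names `sext32` and `slliw` are illustrative only.

```python
MASK64 = (1 << 64) - 1

def sext32(value: int) -> int:
    """Sign-extend the low 32 bits of `value` to 64 bits (returned as unsigned)."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    return value & MASK64

def slliw(rs1: int, shamt: int) -> int:
    # Truncate the shifted value to 32 bits before sign-extending it.
    return sext32((rs1 << (shamt & 0x1F)) & 0xFFFFFFFF)

# Operands from the slliw t0, s4, 0xc reproducer: bit 32 of the raw shift is dropped.
print(hex(slliw(0x10AB95, 0xC)))
```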
Fixed. Thanks again, @m4drat :)
I re-ran the ISA tests and things are looking better!
FAILURE: rv64ui-p-add, 46
FAILURE: rv64ui-p-addi, 83
FAILURE: rv64ui-p-addiw, 83
FAILURE: rv64ui-p-addw, 46
SUCCESS: rv64ui-p-and
FAILURE: rv64ui-p-andi, 15
SUCCESS: rv64ui-p-auipc
SUCCESS: rv64ui-p-beq
SUCCESS: rv64ui-p-bge
SUCCESS: rv64ui-p-bgeu
SUCCESS: rv64ui-p-blt
SUCCESS: rv64ui-p-bltu
SUCCESS: rv64ui-p-bne
SUCCESS: rv64ui-p-fence_i
SUCCESS: rv64ui-p-jal
FAILURE: rv64ui-p-jalr, 15
SUCCESS: rv64ui-p-lb
SUCCESS: rv64ui-p-lbu
SUCCESS: rv64ui-p-ld
SUCCESS: rv64ui-p-lh
SUCCESS: rv64ui-p-lhu
FAILURE: rv64ui-p-lui, 18446744071562067968
SUCCESS: rv64ui-p-lw
SUCCESS: rv64ui-p-lwu
FAILURE: rv64ui-p-ma_data, -1
FAILURE: rv64ui-p-or, 858993459
FAILURE: rv64ui-p-ori, 16713727
SUCCESS: rv64ui-p-sb
SUCCESS: rv64ui-p-sd
SUCCESS: rv64ui-p-sh
SUCCESS: rv64ui-p-simple
FAILURE: rv64ui-p-sll, 1024
FAILURE: rv64ui-p-slli, 34603008
FAILURE: rv64ui-p-slliw, 18446744073441116160
FAILURE: rv64ui-p-sllw, 1024
FAILURE: rv64ui-p-slt, 1
SUCCESS: rv64ui-p-slti
FAILURE: rv64ui-p-sltiu, 1
FAILURE: rv64ui-p-sltu, 1
FAILURE: rv64ui-p-sra, 1024
SUCCESS: rv64ui-p-srai
SUCCESS: rv64ui-p-sraiw
FAILURE: rv64ui-p-sraw, 1024
FAILURE: rv64ui-p-srl, 1024
SUCCESS: rv64ui-p-srli
SUCCESS: rv64ui-p-srliw
FAILURE: rv64ui-p-srlw, 1024
FAILURE: rv64ui-p-sub, 18446744073709551602
FAILURE: rv64ui-p-subw, 18446744073709551602
SUCCESS: rv64ui-p-sw
FAILURE: rv64ui-p-xor, 858993459
FAILURE: rv64ui-p-xori, 16713712
SUCCESS: rv64um-p-div
SUCCESS: rv64um-p-divu
SUCCESS: rv64um-p-divuw
SUCCESS: rv64um-p-divw
FAILURE: rv64um-p-mul, 1122
FAILURE: rv64um-p-mulh, 1122
FAILURE: rv64um-p-mulhsu, 1122
FAILURE: rv64um-p-mulhu, 1122
FAILURE: rv64um-p-mulw, 1122
SUCCESS: rv64um-p-rem
SUCCESS: rv64um-p-remu
SUCCESS: rv64um-p-remuw
SUCCESS: rv64um-p-remw
35/65 passed
The problem is that you allow assignments to x0/zero:
from unicorn.riscv_const import *
from unicorn import *
from triton import *

CODE_START = 0x0

def emu_unicorn():
    # lui zero, 0x80000
    # mv a0, zero
    opcode = bytes.fromhex("37 00 00 80 13 05 00 00")
    mu = Uc(UC_ARCH_RISCV, UC_MODE_RISCV64)
    mu.mem_map(CODE_START, 0x1000)
    mu.mem_write(CODE_START, opcode)
    mu.reg_write(UC_RISCV_REG_A0, 0xffffffff80000000)
    mu.reg_write(UC_RISCV_REG_PC, CODE_START)
    mu.emu_start(CODE_START, CODE_START + len(opcode))
    print(f"(UC) a0 = {hex(mu.reg_read(UC_RISCV_REG_A0))}")

def emu_triton():
    # lui zero, 0x80000
    # mv a0, zero
    opcode = bytes.fromhex("37 00 00 80 13 05 00 00")
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.RV64)
    inst = Instruction()
    inst.setOpcode(opcode)
    inst.setAddress(CODE_START)
    ctx.setConcreteRegisterValue(ctx.registers.x10, 0xffffffff80000000)
    ctx.processing(inst)
    print(f"(TT) a0 = {hex(ctx.getConcreteRegisterValue(ctx.registers.x10))}")

def main():
    emu_unicorn()
    emu_triton()

if __name__ == "__main__":
    main()
Here is also the updated isa-test.py that prints all the registers and showed me the problem:
import sys
from struct import pack
from triton import *
from riscv64_data import TESTS as RV64TESTS
# https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc#register-convention
REG_TO_ABI = {
"x0": "zero", # always zero
"x1": "ra", # return address
"x2": "sp", # stack pointer
"x3": "gp", # global pointer
"x4": "tp", # thread pointer
"x5": "t0", # temporary registers
"x6": "t1", # temporary registers
"x7": "t2", # temporary registers
"x8": "s0", # callee-saved registers
"x9": "s1", # callee-saved registers
"x10": "a0", # argument registers
"x11": "a1", # argument registers
"x12": "a2", # argument registers
"x13": "a3", # argument registers
"x14": "a4", # argument registers
"x15": "a5", # argument registers
"x16": "a6", # argument registers
"x17": "a7", # argument registers
"x18": "s2", # callee-saved registers
"x19": "s3", # callee-saved registers
"x20": "s4", # callee-saved registers
"x21": "s5", # callee-saved registers
"x22": "s6", # callee-saved registers
"x23": "s7", # callee-saved registers
"x24": "s8", # callee-saved registers
"x25": "s9", # callee-saved registers
"x26": "s10", # callee-saved registers
"x27": "s11", # callee-saved registers
"x28": "t3", # temporary registers
"x29": "t4", # temporary registers
"x30": "t5", # temporary registers
"x31": "t6", # temporary registers
}
assert len(REG_TO_ABI) == 32
ABI_TO_REG = {v: k for k, v in REG_TO_ABI.items()}
assert len(ABI_TO_REG) == 32
def emulate_test(name: str, binary: bytes, address: int, offset: int, trace: bool):
# initial state
STACK = 0x200000
istate = {
"stack": bytearray(b"".join([pack('B', 255 - i) for i in range(256)])),
"heap": bytearray(b"".join([pack('B', i) for i in range(256)])),
"x0": 0x0,
"x1": 0x0,
"x2": STACK,
"x3": 0x0,
"x4": 0x0,
"x5": 0x0,
"x6": 0x0,
"x7": 0x0,
"x8": 0x0,
"x9": 0x0,
"x10": 0x0,
"x11": 0x0,
"x12": 0x0,
"x13": 0x0,
"x14": 0x0,
"x15": 0x0,
"x16": 0x0,
"x17": 0x0,
"x18": 0x0,
"x19": 0x0,
"x20": 0x0,
"x21": 0x0,
"x22": 0x0,
"x23": 0x0,
"x24": 0x0,
"x25": 0x0,
"x26": 0x0,
"x27": 0x0,
"x28": 0x0,
"x29": 0x0,
"x30": 0x0,
"x31": 0x0,
"f0": 0x00112233445566778899aabbccddeeff,
"f1": 0xffeeddccbbaa99887766554433221100,
"f2": 0xfefedcdc5656787889892692dfeccaa0,
"f3": 0x1234567890987654321bcdffccddee01,
"f4": 0x0,
"f5": 0x0,
"f6": 0x0,
"f7": 0x0,
"f8": 0x0,
"f9": 0x0,
"f10": 0x0,
"f11": 0x0,
"f12": 0x0,
"f13": 0x0,
"f14": 0x0,
"f15": 0x0,
"f16": 0x0,
"f17": 0x0,
"f18": 0x0,
"f19": 0x0,
"f20": 0x0,
"f21": 0x0,
"f22": 0x0,
"f23": 0x0,
"f24": 0x0,
"f25": 0x0,
"f26": 0x0,
"f27": 0x0,
"f28": 0x0,
"f29": 0x0,
"f30": 0x0,
"f31": 0x0,
"pc": address,
}
ctx = TritonContext()
ctx.setArchitecture(ARCH.RV64)
ctx.setConcreteMemoryAreaValue(STACK, bytes(istate['stack']))
ctx.setConcreteMemoryAreaValue(address, binary[offset:])
for name, value in istate.items():
try:
reg = getattr(ctx.registers, name)
ctx.setConcreteRegisterValue(reg, value)
except AttributeError:
pass
pc = istate['pc']
for i in range(1000):
ctx.setConcreteRegisterValue(ctx.registers.pc, pc)
opcode = ctx.getConcreteMemoryValue(MemoryAccess(pc, CPUSIZE.DWORD))
opcode_bytes = pack('<I', opcode)
inst = Instruction(opcode_bytes)
inst.setAddress(pc)
state = ctx.processing(inst)
if trace:
disasm = inst.getDisassembly()
tokens = [token.rstrip(",") for token in disasm.split(" ")]
info = ""
for op in tokens[1:]:
if op in ABI_TO_REG:
op_abi = ABI_TO_REG[op]
reg = getattr(ctx.registers, op_abi)
value = ctx.getConcreteRegisterValue(reg)
if op == "zero" and value == 0:
continue
if len(info) > 0:
info += ", "
info += f"{op}/{op_abi}={hex(value)}"
print(f"{hex(inst.getAddress())}|{opcode_bytes.hex(' ')}|{disasm} ({info})")
if state == EXCEPTION.NO_FAULT:
pc = ctx.getConcreteRegisterValue(ctx.registers.pc)
else:
disasm = inst.getDisassembly()
if "fence" in disasm:
# HACK: ignore the unsupported fence instruction
pc += 4
elif "ecall" in disasm:
syscall_index = ctx.getConcreteRegisterValue(ctx.registers.x17)
#assert syscall_index == 139, f"invalid syscall: {syscall_index}"
return ctx.getConcreteRegisterValue(ctx.registers.x10)
else:
raise Exception(f"{inst} -> exception {state}")
return -1
if __name__ == "__main__":
success = 0
for name, binary, address, offset in RV64TESTS:
exit_code = emulate_test(name, binary, address, offset, trace=True)
if exit_code == 0:
print(f"SUCCESS: {name}")
success += 1
else:
print(f"FAILURE: {name}, {hex(exit_code)}")
print(f"\n{success}/{len(RV64TESTS)} passed")
Hi, @mrexodia. I guess it's not really me who allows the x0 field in the output state to be modified, and in Triton the x0 register instance is set to be immutable.
I guess here you wanted to process a triton::arch::BasicBlock instead of the single instruction 'inst'. And if the a0 register was manually assigned any value other than the "expected" one, that value would be printed. You can also print the x0 value.
def emu_triton():
    # lui zero, 0x80000
    # mv a0, zero
    opcode = bytes.fromhex("37 00 00 80 13 05 00 00")
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.RV64)
    inst = Instruction()
    inst.setOpcode(opcode)
    inst.setAddress(CODE_START)
    ctx.setConcreteRegisterValue(ctx.registers.x10, 0xffffffff80000000)
    ctx.processing(inst)
    print(f"(TT) a0 = {hex(ctx.getConcreteRegisterValue(ctx.registers.x10))}")
Sorry, I wasn't trying to assign blame to you specifically. Just want to help get the semantics correct 🙂
Thanks for pointing out I didn't do both instructions. I adjusted the tests, but it looks like x0 is getting modified anyway:
from unicorn.riscv_const import *
from unicorn import *
from triton import *

CODE_START = 0x1000

def emu_unicorn():
    # lui zero, 0x80000
    # mv a0, zero
    opcode = bytes.fromhex("37 00 00 80 13 05 00 00")
    mu = Uc(UC_ARCH_RISCV, UC_MODE_RISCV64)
    mu.mem_map(CODE_START, 0x1000)
    mu.mem_write(CODE_START, opcode)
    mu.reg_write(UC_RISCV_REG_PC, CODE_START)
    mu.emu_start(CODE_START, CODE_START + len(opcode))
    print(f"(UC) a0 = {hex(mu.reg_read(UC_RISCV_REG_A0))}")

def emu_triton():
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.RV64)
    # lui zero, 0x80000
    inst1 = Instruction(bytes.fromhex("37 00 00 80"))
    inst1.setAddress(CODE_START)
    ctx.processing(inst1)
    # mv a0, zero
    inst2 = Instruction(bytes.fromhex("13 05 00 00"))
    inst2.setAddress(CODE_START + 4)
    ctx.processing(inst2)
    print(f"(TT) zero = {hex(ctx.getConcreteRegisterValue(ctx.registers.x0))}")
    print(f"(TT) a0 = {hex(ctx.getConcreteRegisterValue(ctx.registers.x10))}")

def main():
    emu_unicorn()
    emu_triton()

if __name__ == "__main__":
    main()
Prints:
(UC) a0 = 0x0
(TT) zero = 0x0
(TT) a0 = 0xffffffff80000000
@Antwy the bug was a copy paste error here:
You defined MUTABLE = false for x0, but this was never passed to the triton::arch::Register (probably because it was copied from the x86 definition). After modifying this locally, the ISA tests go to 64/65 successful!
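To spell out what an immutable x0 means for the emulated state, here is a tiny Python sketch of a register file that discards writes to index 0. It is a made-up illustration (the `RegisterFile` class and `sext32` helper are hypothetical) and says nothing about how Triton implements this internally.

```python
MASK64 = (1 << 64) - 1

def sext32(value: int) -> int:
    """Sign-extend the low 32 bits of `value` to 64 bits (returned as unsigned)."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value -= 1 << 32
    return value & MASK64

class RegisterFile:
    """Hypothetical 32-entry integer register file with a hard-wired zero register."""

    def __init__(self):
        self._regs = [0] * 32

    def write(self, index: int, value: int) -> None:
        if index == 0:
            return                      # x0 is immutable: the write is dropped
        self._regs[index] = value & MASK64

    def read(self, index: int) -> int:
        return self._regs[index]

rf = RegisterFile()
rf.write(0, sext32(0x80000 << 12))      # lui zero, 0x80000
rf.write(10, rf.read(0))                # mv a0, zero
print(hex(rf.read(0)), hex(rf.read(10)))
```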
Yeah, thanks for this one, but I still can reproduce non-zero x0 from your last example. The
print(f"(TT) x0 = {hex(ctx.getSymbolicRegisterValue(ctx.registers.x0))}")
gets me to
(TT) x0 = 0xffffffff80000000
There are two places where the MUTABLE argument was missing, did you fix both of them? Locally I ran the following:
from unicorn.riscv_const import *
from unicorn import *
from triton import *

CODE_START = 0x1000

def emu_unicorn():
    # lui zero, 0x80000
    # mv a0, zero
    opcode = bytes.fromhex("37 00 00 80 13 05 00 00")
    mu = Uc(UC_ARCH_RISCV, UC_MODE_RISCV64)
    mu.mem_map(CODE_START, 0x1000)
    mu.mem_write(CODE_START, opcode)
    mu.reg_write(UC_RISCV_REG_PC, CODE_START)
    mu.emu_start(CODE_START, CODE_START + len(opcode))
    print(f"(UC) x0 = {hex(mu.reg_read(UC_RISCV_REG_ZERO))}")
    print(f"(UC) a0 = {hex(mu.reg_read(UC_RISCV_REG_A0))}")

def emu_triton():
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.RV64)
    # lui zero, 0x80000
    inst1 = Instruction(bytes.fromhex("37 00 00 80"))
    inst1.setAddress(CODE_START)
    ctx.processing(inst1)
    # mv a0, zero
    inst2 = Instruction(bytes.fromhex("13 05 00 00"))
    inst2.setAddress(CODE_START + 4)
    ctx.processing(inst2)
    print(f"(TT) x0 = {hex(ctx.getConcreteRegisterValue(ctx.registers.x0))}")
    print(f"(TT) a0 = {hex(ctx.getConcreteRegisterValue(ctx.registers.x10))}")

def main():
    emu_unicorn()
    emu_triton()

if __name__ == "__main__":
    main()
And it outputs the correct results:
(UC) x0 = 0x0
(UC) a0 = 0x0
(TT) x0 = 0x0
(TT) a0 = 0x0
I also had to rerun python setup.py install to recreate triton.so in my venv
Fixed. Thanks for your help, @mrexodia
Thanks a lot guys for this task force, it's nice to see.
@mrexodia, @Antwy, @m4drat everything is good on your side? Should I merge it or is it still a draft?
Nice! If it looks good to you, I think it can be merged
Yeah all the official ISA tests are passing on my side! The only issue is AppVeyor, but I don't think it's related to the PR?
The only issue is AppVeyor, but I don't think it's related to the PR?
It looks like that issue is related to Capstone 5 on Windows. So what we can do is just use Capstone 4.0 on the AppVeyor CI and then add some #if CS_API_MAJOR >= 5 around the RISC-V support on our side.
What do you think?
@Antwy
Found a problem with compressed slli:
from unicorn.riscv_const import *
from unicorn import *
from triton import *

CODE_START = 0x0

def emu_unicorn():
    # slli t6,t6,0x3c
    opcode = b"\xf2\x1f"
    mu = Uc(UC_ARCH_RISCV, UC_MODE_RISCV64)
    mu.mem_map(CODE_START, 0x1000)
    mu.mem_write(CODE_START, opcode)
    mu.reg_write(UC_RISCV_REG_T6, 0x2107FF)
    mu.reg_write(UC_RISCV_REG_PC, CODE_START)
    mu.emu_start(CODE_START, CODE_START + len(opcode))
    print("(UC) t6 = 0x%x" % mu.reg_read(UC_RISCV_REG_T6))

def emu_triton():
    # slli t6,t6,0x3c
    opcode = b"\xf2\x1f"
    ctx = TritonContext()
    ctx.setArchitecture(ARCH.RV64)
    inst = Instruction()
    inst.setOpcode(opcode)
    inst.setAddress(CODE_START)
    ctx.setConcreteRegisterValue(ctx.registers.x31, 0x2107FF)
    ctx.processing(inst)
    print("(TT) t6 = 0x%x" % ctx.getConcreteRegisterValue(ctx.registers.x31))

def main():
    emu_unicorn()
    emu_triton()

if __name__ == "__main__":
    main()
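For anyone checking the encoding by hand, here is a small Python sketch that decodes the fields of this compressed instruction, assuming the standard C.SLLI layout (shamt[5] in bit 12, rd in bits 11:7, shamt[4:0] in bits 6:2). The function name `decode_c_slli` is made up for this example.

```python
MASK64 = (1 << 64) - 1

def decode_c_slli(halfword: int):
    """Return (rd, shamt) for a 16-bit C.SLLI encoding."""
    assert halfword & 0b11 == 0b10, "not in quadrant 2"
    assert (halfword >> 13) == 0b000, "funct3 does not match C.SLLI"
    rd = (halfword >> 7) & 0x1F
    shamt = (((halfword >> 12) & 1) << 5) | ((halfword >> 2) & 0x1F)
    return rd, shamt

halfword = int.from_bytes(b"\xf2\x1f", "little")
rd, shamt = decode_c_slli(halfword)
print(f"rd = x{rd}, shamt = {hex(shamt)}")

# The full 6-bit shift amount has to be applied on RV64.
print(hex((0x2107FF << shamt) & MASK64))
```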
@Antwy Found a problem with compressed slli
Fixed.
Hi! Added basic support for RISCV instruction set. This covers most of IMC standard ISA extensions for RV32 & RV64. It would be great if you could give some review and merge it.
Some details: