BinaryAnalysisPlatform / bap

Binary Analysis Platform

RISCV PLT call causes subsequent instructions to be lost. #1606

Open matt-j-griffin opened 3 months ago

I've been using BAP to analyze a RISC-V build of cURL (libcurl.4.4.0).

Running llvm-objdump on the binary produces this dump.

Generating BIL for the same binary using bap libcurl.4.4.0 -dbil.adt produces this file.

In the BIL output, once a jal instruction appears in a subroutine, all subsequent instructions in that subroutine are lost. In these cases, jal is used to call PLT stubs in the binary.

An example can be found in the curl_easy_getinfo subroutine given below:

000000000001be2c <curl_easy_getinfo>:
   1be2c: 5d 71         addi    sp, sp, -0x50
   1be2e: 3e fc         sd  a5, 0x38(sp)
   1be30: 1c 10         addi    a5, sp, 0x20
   1be32: 06 ec         sd  ra, 0x18(sp)
   1be34: 32 f0         sd  a2, 0x20(sp)
   1be36: 36 f4         sd  a3, 0x28(sp)
   1be38: 3a f8         sd  a4, 0x30(sp)
   1be3a: c2 e0         sd  a6, 0x40(sp)
   1be3c: c6 e4         sd  a7, 0x48(sp)
   1be3e: 3e e4         sd  a5, 0x8(sp)
   1be40: ef e0 ef ab   jal 0x1a0fe <Curl_getinfo>
   1be44: e2 60         ld  ra, 0x18(sp)
   1be46: 61 61         addi    sp, sp, 0x50
   1be48: 82 80         ret

The BIL for this subroutine is as follows:

1be2c: <curl_easy_getinfo>
1be2c:
1be2c: addi sp, sp, -0x50
(Move(Var("X2",Imm(64)),PLUS(Var("X2",Imm(64)),Int(18446744073709551536,64))))
1be2e: sd a5, 0x38(sp)
(Move(Var("mem",Mem(64,8)),Store(Var("mem",Mem(64,8)),PLUS(Var("X2",Imm(64)),Int(56,64)),Var("X15",Imm(64)),LittleEndian(),64)))
1be30: addi a5, sp, 0x20
(Move(Var("X15",Imm(64)),PLUS(Var("X2",Imm(64)),Int(32,64))))
1be32: sd ra, 0x18(sp)
(Move(Var("mem",Mem(64,8)),Store(Var("mem",Mem(64,8)),PLUS(Var("X2",Imm(64)),Int(24,64)),Var("X1",Imm(64)),LittleEndian(),64)))
1be34: sd a2, 0x20(sp)
(Move(Var("mem",Mem(64,8)),Store(Var("mem",Mem(64,8)),PLUS(Var("X2",Imm(64)),Int(32,64)),Var("X12",Imm(64)),LittleEndian(),64)))
1be36: sd a3, 0x28(sp)
(Move(Var("mem",Mem(64,8)),Store(Var("mem",Mem(64,8)),PLUS(Var("X2",Imm(64)),Int(40,64)),Var("X13",Imm(64)),LittleEndian(),64)))
1be38: sd a4, 0x30(sp)
(Move(Var("mem",Mem(64,8)),Store(Var("mem",Mem(64,8)),PLUS(Var("X2",Imm(64)),Int(48,64)),Var("X14",Imm(64)),LittleEndian(),64)))
1be3a: sd a6, 0x40(sp)
(Move(Var("mem",Mem(64,8)),Store(Var("mem",Mem(64,8)),PLUS(Var("X2",Imm(64)),Int(64,64)),Var("X16",Imm(64)),LittleEndian(),64)))
1be3c: sd a7, 0x48(sp)
(Move(Var("mem",Mem(64,8)),Store(Var("mem",Mem(64,8)),PLUS(Var("X2",Imm(64)),Int(72,64)),Var("X17",Imm(64)),LittleEndian(),64)))
1be3e: sd a5, 0x8(sp)
(Move(Var("mem",Mem(64,8)),Store(Var("mem",Mem(64,8)),PLUS(Var("X2",Imm(64)),Int(8,64)),Var("X15",Imm(64)),LittleEndian(),64)))
1be40: jal -0x1d42
(Move(Var("X1",Imm(64)),Int(114244,64)), Jmp(Int(106750,64)))
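As a sanity check on the lifted jal itself, the constants in the BIL above can be compared with the objdump listing. The sketch below only redoes that arithmetic with values copied from the two listings; the variable and function names are illustrative and not part of BAP. It suggests the semantics of the jal are lifted as expected (link register set to the next instruction, jump to the objdump target), so the problem appears to be in what happens after the call rather than in the jal's BIL.

```python
# Values copied from the objdump and BIL listings above.
JAL_ADDR = 0x1BE40        # address of the jal in the objdump listing
JAL_SIZE = 4              # "ef e0 ef ab" is a 4-byte encoding
JAL_OFFSET = -0x1D42      # operand BAP prints for the jal
BIL_LINK = 114244         # Int(114244,64) moved into X1 (ra)
BIL_TARGET = 106750       # Int(106750,64) in the Jmp
BIL_ADDI_IMM = 18446744073709551536  # immediate in the first addi


def to_signed64(value: int) -> int:
    """Interpret an unsigned 64-bit integer as two's complement."""
    return value - (1 << 64) if value >= (1 << 63) else value


return_address = JAL_ADDR + JAL_SIZE   # where execution resumes after the call
jump_target = JAL_ADDR + JAL_OFFSET    # pc-relative destination

print(hex(return_address), hex(BIL_LINK))    # both 0x1be44
print(hex(jump_target), hex(BIL_TARGET))     # both 0x1a0fe
print(hex(to_signed64(BIL_ADDI_IMM)))        # -0x50
```

The link value 0x1be44 is exactly the address of the first instruction that is missing from the BIL output.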

The instructions at 1be44, 1be46 and 1be48 (ld, addi and ret) do not appear in the BIL output.
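To find every affected address rather than checking subroutines by hand, the two dumps can be diffed by address. This is a minimal sketch that assumes both files use the line shapes shown in the excerpts above (a hex address, a colon, then text); the `addresses` helper and the regular expression are hypothetical and may need adjusting for other output formats. The sample strings are short excerpts of the listings above.

```python
import re

# Matches lines such as "   1be40: ef e0 ef ab   jal ..." or "1be40: jal -0x1d42".
# Assumption: an instruction line is "<hex address>: <something>".
ADDR_LINE = re.compile(r"^\s*([0-9a-f]+):\s+\S")


def addresses(listing: str) -> set[int]:
    """Collect every address that starts a line in a listing."""
    found = set()
    for line in listing.splitlines():
        match = ADDR_LINE.match(line)
        if match:
            found.add(int(match.group(1), 16))
    return found


OBJDUMP_EXCERPT = """
   1be3e: 3e e4         sd  a5, 0x8(sp)
   1be40: ef e0 ef ab   jal 0x1a0fe <Curl_getinfo>
   1be44: e2 60         ld  ra, 0x18(sp)
   1be46: 61 61         addi    sp, sp, 0x50
   1be48: 82 80         ret
"""

BIL_EXCERPT = """
1be3e: sd a5, 0x8(sp)
(Move(Var("mem",Mem(64,8)),Store(Var("mem",Mem(64,8)),PLUS(Var("X2",Imm(64)),Int(8,64)),Var("X15",Imm(64)),LittleEndian(),64)))
1be40: jal -0x1d42
(Move(Var("X1",Imm(64)),Int(114244,64)), Jmp(Int(106750,64)))
"""

missing = sorted(addresses(OBJDUMP_EXCERPT) - addresses(BIL_EXCERPT))
print([hex(address) for address in missing])
```

Running the same comparison over the full dump and the full BIL file would list all addresses that objdump disassembles but BAP does not emit.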

Is there a workaround?