Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License

A64 instructions with pointer authentication (PAC) should not be lifted as intrinsics. #4638

Closed lwerdna closed 12 months ago

lwerdna commented 1 year ago

EDIT: The table below is obsolete! I originally thought each instruction would be permanently converted and that the table would track progress. Instead, the behavior is now configurable in the architecture, controlled by a preprocessor macro, and the default setting is non-intrinsic for the instructions whose intrinsic lift degrades readability.

These are all the instructions associated with PAC, each with an example encoding. Most are lifted as intrinsics; we need to move away from this and toward the normal (non-authenticated) LLIL description of what the instructions do.

| Status | Encoding | Disassembly | Encoding name | LLIL |
| --- | --- | --- | --- | --- |
| | DAC11A56 | autda x22, x18 | AUTDA_64P_dp_1src | `LLIL_INTRINSIC([x22],__autda,[LLIL_REG.q(x22),LLIL_REG.q(x18)])` |
| | DAC13BE1 | autdza x1 | AUTDZA_64Z_dp_1src | `LLIL_INTRINSIC([x1],__autda,[LLIL_REG.q(x1),LLIL_CONST.q(0x0)])` |
| | DAC11C73 | autdb x19, x3 | AUTDB_64P_dp_1src | `LLIL_INTRINSIC([x19],__autdb,[LLIL_REG.q(x19),LLIL_REG.q(x3)])` |
| | DAC13FE1 | autdzb x1 | AUTDZB_64Z_dp_1src | `LLIL_INTRINSIC([x1],__autdb,[LLIL_REG.q(x1),LLIL_CONST.q(0x0)])` |
| | DAC1113A | autia x26, x9 | AUTIA_64P_dp_1src | `LLIL_INTRINSIC([x26],__autia,[LLIL_REG.q(x26),LLIL_REG.q(x9)])` |
| | DAC133FE | autiza x30 | AUTIZA_64Z_dp_1src | `LLIL_INTRINSIC([x30],__autia,[LLIL_REG.q(x30),LLIL_CONST.q(0x0)])` |
| | DAC116B2 | autib x18, x21 | AUTIB_64P_dp_1src | `LLIL_INTRINSIC([x18],__autib,[LLIL_REG.q(x18),LLIL_REG.q(x21)])` |
| | DAC137F1 | autizb x17 | AUTIZB_64Z_dp_1src | `LLIL_INTRINSIC([x17],__autib,[LLIL_REG.q(x17),LLIL_CONST.q(0x0)])` |
| :heavy_check_mark: | D63F0180 | blr x12 | BLR_64_branch_reg | `LLIL_CALL(LLIL_REG.q(x12))` |
| :heavy_check_mark: | D63F08FF | blraaz x7 | BLRAAZ_64_branch_reg | `LLIL_CALL(LLIL_REG.q(x7))` |
| :heavy_check_mark: | D73F0BFF | blraa xzr, sp | BLRAA_64P_branch_reg | `LLIL_CALL(LLIL_CONST.q(0x0))` |
| :heavy_check_mark: | D63F0DBF | blrabz x13 | BLRABZ_64_branch_reg | `LLIL_CALL(LLIL_REG.q(x13))` |
| :heavy_check_mark: | D73F0C9F | blrab x4, sp | BLRAB_64P_branch_reg | `LLIL_CALL(LLIL_REG.q(x4))` |
| :heavy_check_mark: | D61F02A0 | br x21 | BR_64_branch_reg | `LLIL_JUMP(LLIL_REG.q(x21))` |
| :heavy_check_mark: | D61F08FF | braaz x7 | BRAAZ_64_branch_reg | `LLIL_JUMP(LLIL_REG.q(x7))` |
| :heavy_check_mark: | D71F0B25 | braa x25, x5 | BRAA_64P_branch_reg | `LLIL_JUMP(LLIL_REG.q(x25))` |
| :heavy_check_mark: | D61F0CDF | brabz x6 | BRABZ_64_branch_reg | `LLIL_JUMP(LLIL_REG.q(x6))` |
| :heavy_check_mark: | D71F0EF1 | brab x23, x17 | BRAB_64P_branch_reg | `LLIL_JUMP(LLIL_REG.q(x23))` |
| | D69F03E0 | eret | ERET_64E_branch_reg | `LLIL_INTRINSIC([],_eret,[]); LLIL_TRAP(0)` |
| | D69F0BFF | eretaa | ERETAA_64E_branch_reg | `LLIL_INTRINSIC([],_eret,[]); LLIL_TRAP(0)` |
| | D69F0FFF | eretab | ERETAB_64E_branch_reg | `LLIL_INTRINSIC([],_eret,[]); LLIL_TRAP(0)` |
| :heavy_check_mark: | F872A7C7 | ldraa x7, [x30, #-0x6b0] | LDRAA_64_ldst_pac | `LLIL_SET_REG.q(x7,LLIL_LOAD.q(LLIL_ADD.q(LLIL_REG.q(x30),LLIL_CONST.q(0xFFFFFFFFFFFFF950))))` |
| :heavy_check_mark: | F861CFE7 | ldraa x7, [sp, #-0xf20]! | LDRAA_64W_ldst_pac | `LLIL_SET_REG.q(sp,LLIL_ADD.q(LLIL_REG.q(sp),LLIL_CONST.q(0xFFFFFFFFFFFFF0E0))); LLIL_SET_REG.q(x7,LLIL_LOAD.q(LLIL_REG.q(sp)))` |
| :heavy_check_mark: | F8B1A63B | ldrab x27, [x17, #0x8d0] | LDRAB_64_ldst_pac | `LLIL_SET_REG.q(x27,LLIL_LOAD.q(LLIL_ADD.q(LLIL_REG.q(x17),LLIL_CONST.q(0x8D0))))` |
| :heavy_check_mark: | F8B59C34 | ldrab x20, [x1, #0xac8]! | LDRAB_64W_ldst_pac | `LLIL_SET_REG.q(x1,LLIL_ADD.q(LLIL_REG.q(x1),LLIL_CONST.q(0xAC8))); LLIL_SET_REG.q(x20,LLIL_LOAD.q(LLIL_REG.q(x1)))` |
| | DAC10AA9 | pacda x9, x21 | PACDA_64P_dp_1src | `LLIL_INTRINSIC([x9],__pacda,[LLIL_REG.q(x9),LLIL_REG.q(x21)])` |
| | DAC12BE5 | pacdza x5 | PACDZA_64Z_dp_1src | `LLIL_INTRINSIC([x5],__pacda,[LLIL_REG.q(x5),LLIL_CONST.q(0x0)])` |
| | DAC10C6E | pacdb x14, x3 | PACDB_64P_dp_1src | `LLIL_INTRINSIC([x14],__pacdb,[LLIL_REG.q(x14),LLIL_REG.q(x3)])` |
| | DAC12FE1 | pacdzb x1 | PACDZB_64Z_dp_1src | `LLIL_INTRINSIC([x1],__pacdb,[LLIL_REG.q(x1),LLIL_CONST.q(0x0)])` |
| | 9ACC33E1 | pacga x1, xzr, x12 | PACGA_64P_dp_2src | `LLIL_INTRINSIC([x1],__pacga,[LLIL_CONST.q(0x0),LLIL_REG.q(x12)])` |
| | DAC101C6 | pacia x6, x14 | PACIA_64P_dp_1src | `LLIL_INTRINSIC([x6],__pacia,[LLIL_REG.q(x6),LLIL_REG.q(x14)])` |
| | DAC123F5 | paciza x21 | PACIZA_64Z_dp_1src | `LLIL_INTRINSIC([x21],__pacia,[LLIL_REG.q(x21),LLIL_CONST.q(0x0)])` |
| | DAC1049D | pacib x29, x4 | PACIB_64P_dp_1src | `LLIL_INTRINSIC([x29],__pacib,[LLIL_REG.q(x29),LLIL_REG.q(x4)])` |
| | DAC127EE | pacizb x14 | PACIZB_64Z_dp_1src | `LLIL_INTRINSIC([x14],__pacib,[LLIL_REG.q(x14),LLIL_CONST.q(0x0)])` |
| :heavy_check_mark: | D65F0360 | ret x27 | RET_64R_branch_reg | `LLIL_RET(LLIL_REG.q(x27))` |
| :heavy_check_mark: | D65F0BFF | retaa | RETAA_64E_branch_reg | `LLIL_RET(LLIL_REG.q(x30))` |
| :heavy_check_mark: | D65F0FFF | retab | RETAB_64E_branch_reg | `LLIL_RET(LLIL_REG.q(x30))` |
| | DAC147FF | xpacd xzr | XPACD_64Z_dp_1src | `LLIL_INTRINSIC([xzr],__xpacd,[LLIL_CONST.q(0x0)])` |
| | DAC143F9 | xpaci x25 | XPACI_64Z_dp_1src | `LLIL_INTRINSIC([x25],__xpaci,[LLIL_REG.q(x25)])` |

A convenient list of these encodings is:

[0xDAC11A56, 0xDAC13BE1, 0xDAC11C73, 0xDAC13FE1, 0xDAC1113A, 0xDAC133FE, 0xDAC116B2, 0xDAC137F1, 0xD63F0180, 0xD63F08FF, 0xD73F0BFF, 0xD63F0DBF, 0xD73F0C9F, 0xD61F02A0, 0xD61F08FF, 0xD71F0B25, 0xD61F0CDF, 0xD71F0EF1, 0xD69F03E0, 0xD69F0BFF, 0xD69F0FFF, 0xF872A7C7, 0xF861CFE7, 0xF8B1A63B, 0xF8B59C34, 0xDAC10AA9, 0xDAC12BE5, 0xDAC10C6E, 0xDAC12FE1, 0x9ACC33E1, 0xDAC101C6, 0xDAC123F5, 0xDAC1049D, 0xDAC127EE, 0xD65F0360, 0xD65F0BFF, 0xD65F0FFF, 0xDAC147FF, 0xDAC143F9]
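To feed these encodings to a disassembler or lifter, each 32-bit instruction word has to be turned into bytes first. A64 instructions are stored little-endian, so a minimal sketch could look like the following. The helper names and the shortened `ENCODINGS` list are illustrative only and not part of any Binary Ninja API.

```python
import struct

# A few encodings copied from the list above; extend with the rest as needed.
ENCODINGS = [0xDAC11A56, 0xD63F08FF, 0xF872A7C7, 0xD65F0BFF]


def encoding_to_bytes(encoding: int) -> bytes:
    """Pack one 32-bit A64 instruction word as little-endian bytes."""
    return struct.pack("<I", encoding)


def encodings_to_blob(encodings) -> bytes:
    """Concatenate the packed instructions into one contiguous byte string."""
    return b"".join(encoding_to_bytes(e) for e in encodings)


if __name__ == "__main__":
    for enc in ENCODINGS:
        # Show the instruction word next to its in-memory byte order.
        print(f"{enc:08X} -> {encoding_to_bytes(enc).hex()}")
```

The resulting bytes (4 per instruction) are what a lifter would see in memory, so the same blob can be reused to re-check the table whenever the lifting changes.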

I will try to mark the status column with check marks as progress is made.

lwerdna commented 1 year ago

Related issue that envisions an authenticated function attribute: https://github.com/Vector35/binaryninja-api/issues/3856

Related issue: https://github.com/Vector35/binaryninja-api/issues/3997

lwerdna commented 12 months ago

Commit: https://github.com/Vector35/arch-arm64/commit/3847e619459195db64b5a1ea14a445dbe1c4e8ce

Instructions that sign pointers (PAC.*), authenticate pointers (AUT.*), and strip pointers (XPAC.*) now lift to nop by default, leaving the pointer untouched.

Instructions that amount to an existing instruction plus authentication, like BLRA.*, LDRA.*, and RETA.*, now lift the same as the existing instruction.
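The default policy described above can be modeled in a few lines of Python. This is only an illustrative sketch: the real implementation lives in the C++ architecture plugin, and the function name, the prefix tables, and the inclusion of BRA.* (taken from the checked rows in the table) are assumptions rather than code from the commit. It applies the mnemonic patterns literally and has not been checked against the actual lifter.

```python
# Hypothetical model of the default (non-intrinsic) lifting policy.
# Mnemonics with these prefixes sign, authenticate, or strip a pointer.
NOP_PREFIXES = ("pac", "aut", "xpac")

# "Existing instruction plus authentication" mapped to the existing instruction.
BASE_FOR_PREFIX = {
    "blra": "blr",  # blraa, blrab, blraaz, blrabz
    "bra": "br",    # braa, brab, braaz, brabz
    "ldra": "ldr",  # ldraa, ldrab
    "reta": "ret",  # retaa, retab
}


def default_lift(mnemonic: str) -> str:
    """Return the mnemonic whose lifting the PAC instruction is treated as."""
    m = mnemonic.lower()
    if m.startswith(NOP_PREFIXES):
        return "nop"
    for prefix, base in BASE_FOR_PREFIX.items():
        if m.startswith(prefix):
            return base
    # Not a PAC-related mnemonic: lifted as itself.
    return m


if __name__ == "__main__":
    for name in ("pacibsp", "retab", "blraaz", "ldraa", "xpaci", "ret"):
        print(f"{name} -> {default_lift(name)}")
```

Under this model, `pacibsp` is treated as a nop and `retab` as a plain `ret`.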

Here's a function using pacibsp (add a PAC to lr using key B, with sp as the context/modifier) and the corresponding retab (authenticate lr with key B, then return to it). From left to right: Disassembly, Low Level IL, Pseudo C:

before:

image

after:

image