Improve ZydisDisassembledInstruction

zyantific / zydis

Fast and lightweight x86/x86-64 disassembler and code generation library

https://zydis.re

MIT License


Improve ZydisDisassembledInstruction #504

Closed alessandromrc closed 6 months ago

alessandromrc commented 6 months ago

It would be useful if ZydisDisassembledInstruction could also indicate whether the instruction is a jump. This would help exception handlers continue a forced execution by understanding the context. I am also sure it could be used when building disassemblers, since it makes splitting code into functions easier.

This should cover instructions like JMP, JNZ, JLE, and all the other conditional jumps.

alessandromrc commented 6 months ago

I doubt this is the best way to do it, but this is the idea:

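  #include <algorithm>
  #include <vector>
  #include <Zydis/Zydis.h>

  // Every mnemonic treated as a jump (JMP plus the conditional jump forms).
  const std::vector<ZydisMnemonic> JMP_List{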
    ZYDIS_MNEMONIC_JB,
    ZYDIS_MNEMONIC_JBE,
    ZYDIS_MNEMONIC_JCXZ,
    ZYDIS_MNEMONIC_JECXZ,
    ZYDIS_MNEMONIC_JKNZD,
    ZYDIS_MNEMONIC_JKZD,
    ZYDIS_MNEMONIC_JL,
    ZYDIS_MNEMONIC_JLE,
    ZYDIS_MNEMONIC_JMP,
    ZYDIS_MNEMONIC_JNB,
    ZYDIS_MNEMONIC_JNBE,
    ZYDIS_MNEMONIC_JNL,
    ZYDIS_MNEMONIC_JNLE,
    ZYDIS_MNEMONIC_JNO,
    ZYDIS_MNEMONIC_JNP,
    ZYDIS_MNEMONIC_JNS,
    ZYDIS_MNEMONIC_JNZ,
    ZYDIS_MNEMONIC_JO,
    ZYDIS_MNEMONIC_JP,
    ZYDIS_MNEMONIC_JRCXZ,
    ZYDIS_MNEMONIC_JS,
    ZYDIS_MNEMONIC_JZ,
  };

  // Returns true if the mnemonic appears in JMP_List (linear search).
  bool isJumpMnemonic(ZydisMnemonic mnemonic) {
    return std::find(JMP_List.begin(), JMP_List.end(), mnemonic) != JMP_List.end();
  }

athre0z commented 6 months ago

Does instr.meta.category and the corresponding ZydisInstructionCategory provide what you are looking for? The two categories for jumps should be ZYDIS_CATEGORY_UNCOND_BR and ZYDIS_CATEGORY_COND_BR.

Alternatively you could also inspect meta.branch_type, although that will also be set for call instructions.
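The two checks suggested above could be sketched roughly as follows. The helper names `isJumpCategory` and `hasBranchTypeExcludingCalls` are hypothetical and not part of Zydis. The fallback enums exist only so the snippet compiles without the Zydis headers installed: their names mirror Zydis, but their numeric values are placeholders. Whether `branch_type` is also set for instructions other than jumps and calls is not verified here.

```cpp
#if __has_include(<Zydis/Zydis.h>)
#include <Zydis/Zydis.h>
#else
// Assumption: stand-ins so the sketch builds without Zydis installed.
// Names mirror Zydis; the numeric values are placeholders only.
enum ZydisInstructionCategory {
    ZYDIS_CATEGORY_INVALID,
    ZYDIS_CATEGORY_CALL,
    ZYDIS_CATEGORY_COND_BR,
    ZYDIS_CATEGORY_DATAXFER,
    ZYDIS_CATEGORY_UNCOND_BR
};
enum ZydisBranchType {
    ZYDIS_BRANCH_TYPE_NONE,
    ZYDIS_BRANCH_TYPE_SHORT,
    ZYDIS_BRANCH_TYPE_NEAR,
    ZYDIS_BRANCH_TYPE_FAR
};
#endif

// Hypothetical helper: true for the two jump categories named in the thread.
inline bool isJumpCategory(ZydisInstructionCategory category) {
    return category == ZYDIS_CATEGORY_COND_BR ||
           category == ZYDIS_CATEGORY_UNCOND_BR;
}

// Hypothetical helper: true when a branch type is set and the instruction
// is not in the CALL category. This only filters out calls; it makes no
// claim about other instructions that may carry a branch type.
inline bool hasBranchTypeExcludingCalls(ZydisBranchType branch_type,
                                        ZydisInstructionCategory category) {
    return branch_type != ZYDIS_BRANCH_TYPE_NONE &&
           category != ZYDIS_CATEGORY_CALL;
}
```

With a decoded instruction, this would be called as `isJumpCategory(instr.meta.category)`. For a ZydisDisassembledInstruction the same fields should be reachable through its `info` member (for example `instr.info.meta.category`), assuming the Zydis v4 layout.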

alessandromrc commented 6 months ago

> Does instr.meta.category and the corresponding ZydisInstructionCategory provide what you are looking for? The two categories for jumps should be ZYDIS_CATEGORY_UNCOND_BR and ZYDIS_CATEGORY_COND_BR.
>
> Alternatively you could also inspect meta.branch_type, although that will also be set for call instructions.

I just tried it and it doesn't seem to work: it reports MOV and other instructions as jumps, so I doubt that's the correct way of handling it.

athre0z commented 6 months ago

Did you try the first or the last variant that I suggested? At least the former should definitely work. The latter should also not trigger for a mov, though admittedly I've personally never really used that field, so there could be a bug.

For the category:

$ jq -r '.[] | select(.meta_info.category == "UNCOND_BR" or .meta_info.category == "COND_BR") | .mnemonic' instructions.json | sort | uniq
jb
jbe
jcxz
jecxz
jknzd
jkzd
jl
jle
jmp
jnb
jnbe
jnl
jnle
jno
jnp
jns
jnz
jo
jp
jrcxz
js
jz
loop
loope
loopne
xabort
xbegin
xend
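The listing above shows that the two branch categories cover more than the Jcc/JMP mnemonics from the original list: loop, loope, loopne, xabort, xbegin and xend are included as well. If only the plain jumps are wanted, one option is to combine the category check with a small exclusion list, roughly as sketched below. The helper name `isPlainJump` is hypothetical, and the fallback enums are placeholders (names mirror Zydis, values do not) so the snippet compiles without the Zydis headers.

```cpp
#if __has_include(<Zydis/Zydis.h>)
#include <Zydis/Zydis.h>
#else
// Assumption: stand-ins so the sketch builds without Zydis installed.
// Names mirror Zydis; the numeric values are placeholders only.
enum ZydisInstructionCategory {
    ZYDIS_CATEGORY_INVALID,
    ZYDIS_CATEGORY_COND_BR,
    ZYDIS_CATEGORY_UNCOND_BR
};
enum ZydisMnemonic {
    ZYDIS_MNEMONIC_INVALID,
    ZYDIS_MNEMONIC_JMP,
    ZYDIS_MNEMONIC_JNZ,
    ZYDIS_MNEMONIC_LOOP,
    ZYDIS_MNEMONIC_LOOPE,
    ZYDIS_MNEMONIC_LOOPNE,
    ZYDIS_MNEMONIC_XABORT,
    ZYDIS_MNEMONIC_XBEGIN,
    ZYDIS_MNEMONIC_XEND
};
#endif

// Hypothetical helper: accepts the two branch categories, then rejects the
// non-Jcc/JMP mnemonics that appear in the jq listing.
inline bool isPlainJump(ZydisInstructionCategory category,
                        ZydisMnemonic mnemonic) {
    if (category != ZYDIS_CATEGORY_COND_BR &&
        category != ZYDIS_CATEGORY_UNCOND_BR) {
        return false;
    }
    switch (mnemonic) {
    case ZYDIS_MNEMONIC_LOOP:
    case ZYDIS_MNEMONIC_LOOPE:
    case ZYDIS_MNEMONIC_LOOPNE:
    case ZYDIS_MNEMONIC_XABORT:
    case ZYDIS_MNEMONIC_XBEGIN:
    case ZYDIS_MNEMONIC_XEND:
        return false;
    default:
        return true;
    }
}
```

Compared with a hand-written list of every jump mnemonic, this keeps the list short and lets newly added jump mnemonics pass through the category check automatically, at the cost of having to extend the exclusions if further non-jump instructions join these categories.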

flobernd commented 6 months ago

@alessandromrc

> Just tried it and it seems to not work, tells MOV and other instructions are Jumps so I doubt that's the correct way of handling it.

Are you sure you looked at the correct instruction? meta.branch_type works as intended according to my tests. Your way of doing it is not horrible either 🙂 When I was working on the hook library a long time ago, I did the same. However, it depends on the context. In the hook library I have to know details about specific instructions anyway, but in other cases I prefer to be more generic. Using the category makes sure you are always on the safe side, even if new branch instructions get added in the future.

alessandromrc commented 6 months ago

> @alessandromrc
>
> > Just tried it and it seems to not work, tells MOV and other instructions are Jumps so I doubt that's the correct way of handling it.
>
> Are you sure you looked at the correct instruction? meta.branch_type works as intended according to my tests. Your way of doing it is not horrible either 🙂 When I was working on the hook library a long time ago, I did the same. However, it depends on the context. In the hook library I have to know details about specific instructions anyway, but in other cases I prefer to be more generic. Using the category makes sure you are always on the safe side, even if new branch instructions get added in the future.

In my case it would come in handy to actually know the context of the instructions, so I think the implementation you used in the hook library is fairly robust. I will probably rely on something like it.

I will close the issue since this is now solved. Thanks @athre0z and @flobernd!