[Question] Generating an executable that contains every valid x86 instruction

recvfrom commented 6 years ago

It'd be great to have an executable that contains every valid, non-privileged x86 instruction for testing things like binary disassembly engines. Even better, it'd be great if this executable could be run successfully on any given machine (for example, have it check the CPU capabilities and only execute instructions that are supported by the system executing the program) for testing dynamic execution engines or CPU emulators. I'm sure Intel already has programs like this for internal testing, but I couldn't find anything publicly available, so I'd like to explore generating one (in an automated fashion if possible, since it seems like there are a lot of valid instruction encodings).

Intel XED seems like it would be a great tool to leverage, since it understands all the instructions, supported operand types, and CPUID relationships. I'm not familiar with the codebase at all, though, so I was wondering if you had any recommendations on how to get started with something like this. For instance, would it be better to use the provided C API, or could the Python data file parsing and related functions be leveraged? If the latter, is there any example Python code that would be a good starting place? Are there any potential challenges that you foresee?

Thank you!

markcharney commented 6 years ago

Wow, very big question. :-)

I don't have anything like that directly available to me (a test with all instructions). I work with a lot of internal validation teams and they (thankfully) have all sorts of stuff. The space of valid instructions is quite large if you factor in all valid register combinations. And even if you were not interested in all trivial variations of the GPRs, there are weird memory addressing cases (like no-base or no-index SIB addressing) which can often expose things or break assumptions. Another area is ignored or redundant legacy prefixes. There are typically two different displacement sizes to factor in for memops. Generating tests with branches adds another layer of complexity because you have to branch somewhere known. The same goes for instructions that modify the stack or the x87 register stack. Depending on how far you want to go, it could be a big project.
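To make the addressing cases above concrete, here is a minimal sketch that hand-assembles a few encodings of one instruction, MOV r32, r/m32 (opcode 0x8B), for 64-bit mode. It does not use XED; the helper names (modrm, sib, variants) are made up for illustration, and it covers only a handful of the variations mentioned (two displacement sizes, SIB with no index, SIB with no base, one ignored legacy prefix).

```python
import struct

OPCODE = b"\x8b"        # MOV r32, r/m32
RAX, RCX, RBX = 0, 1, 3  # register numbers as used in ModRM/SIB fields
SIB_FOLLOWS = 0b100     # ModRM.rm = 100: a SIB byte follows
NO_INDEX = 0b100        # SIB.index = 100: no index register
NO_BASE = 0b101         # SIB.base = 101 with mod = 00: no base, disp32 follows


def modrm(mod, reg, rm):
    """Pack a ModRM byte from its three fields."""
    return bytes([(mod << 6) | (reg << 3) | rm])


def sib(scale, index, base):
    """Pack a SIB byte; scale is the log2 of the multiplier (2 means *4)."""
    return bytes([(scale << 6) | (index << 3) | base])


def variants():
    """Yield (description, bytes) pairs for several memory-operand forms."""
    disp8 = struct.pack("<b", 0x10)
    disp32 = struct.pack("<i", 0x10)
    yield "mov eax, [rbx]", OPCODE + modrm(0b00, RAX, RBX)
    yield "mov eax, [rbx+0x10] disp8", OPCODE + modrm(0b01, RAX, RBX) + disp8
    yield "mov eax, [rbx+0x10] disp32", OPCODE + modrm(0b10, RAX, RBX) + disp32
    # Same effective address as the first form, but routed through a SIB byte.
    yield "mov eax, [rbx] via SIB, no index", (
        OPCODE + modrm(0b00, RAX, SIB_FOLLOWS) + sib(0, NO_INDEX, RBX))
    # Index*scale plus a mandatory disp32, with no base register at all.
    yield "mov eax, [rcx*4+0x10] via SIB, no base", (
        OPCODE + modrm(0b00, RAX, SIB_FOLLOWS) + sib(2, RCX, NO_BASE) + disp32)
    # A DS segment override (0x3E) is accepted but ignored in 64-bit mode.
    yield "mov eax, [rbx] with ignored DS prefix", (
        b"\x3e" + OPCODE + modrm(0b00, RAX, RBX))


for name, code in variants():
    print(f"{code.hex(' '):24} {name}")
```

Even this one opcode with one register pair yields several distinct byte sequences, which is why the full space gets large quickly.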

If I were to start on something like that, I would start by reading the XED "intermediate" database (after the generation step) with the read_xed_db.py script (there are lots of examples of that), but a lot of work would be required to convert the patterns into something executable.

Another approach I have seen some people use is to parse the iforms and try to then generate data structures they pass to the XED encoder. The iforms are not perfect for this and there are many special cases that are not clearly elucidated (or accessible) by taking just that approach.

If it could be avoided, I might consider generating and executing the generated bytes directly in memory rather than generating an executable since formats differ by O/S. Someone recently showed me this cool technique for doing an in-python JIT. https://github.com/flababah/cpuid.py and I thought that would be handy if I ever needed to do something like this in a portable way.
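A rough sketch of the general idea of running generated bytes in memory from Python is below. It is not taken from cpuid.py; it assumes a Unix-like OS on an x86 or x86-64 CPU, a calling convention that returns integers in EAX, and a system that permits read/write/execute anonymous mappings (hardened systems may refuse). The names CODE and run_native are hypothetical.

```python
import ctypes
import mmap

# mov eax, 42 ; ret   (valid in both 32-bit and 64-bit mode)
CODE = bytes([0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3])


def run_native(code):
    """Copy `code` into an executable anonymous mapping and call it as int f(void).

    Assumes a Unix-like OS (mmap.PROT_EXEC) on an x86/x86-64 host.
    """
    size = max(len(code), mmap.PAGESIZE)
    buf = mmap.mmap(-1, size,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE | mmap.PROT_EXEC)
    buf.write(code)
    # Get the mapping's address and wrap it as a C function pointer.
    addr = ctypes.addressof(ctypes.c_char.from_buffer(buf))
    func = ctypes.CFUNCTYPE(ctypes.c_int)(addr)
    return func()
```

On a suitable host, calling run_native(CODE) executes the two instructions and hands back whatever they left in EAX. For a test generator, the byte string would come from the encoder rather than being written by hand.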

I imagine a lot of people have thought about this...

recvfrom commented 6 years ago

Thank you so much for the quick, thorough response - it was a big help! I've started work on a proof of concept using the intermediate database parsed by read_xed_db.py, and so far it seems like I'm making good progress (although I'm currently only looking at the "easy" instructions!)

Would you mind if I kept this issue open in case I have related questions in the future, or is there a better avenue to leverage for questions?

Thanks again!

recvfrom commented 6 years ago

I decided to take a step back and have been working with the ex1 program for now. I had two questions from that:

1) If an instruction has a REX prefix with no bits set (so, 0x40, which effectively does nothing), what's the best way to determine that given a xed_decoded_inst_t?

2) There are some instructions (see below) for which calling 'xed_operand_width_enum_t2str(xed_operand_width(op))' returns 'INVALID'... It seems like they should have a valid width, so I was curious why this is the case:

$ ./obj/examples/xed-ex1 -64 0F34
Attempting to decode: 0f 34
iclass SYSENTER    category SYSCALL    ISA-extension BASE    ISA-set PPRO
instruction-length 2
operand-width 32
effective-operand-width 32
effective-address-width 64
stack-address-width 64
iform-enum-name SYSENTER
iform-enum-name-dispatch (zero based) 0
iclass-max-iform-dispatch 1
Operands
#   TYPE            DETAILS     VIS  RW     OC2 BITS BYTES NELEM ELEMSZ   ELEMTYPE   REGCLASS
#   ====            =======     ===  ==     === ==== ===== ===== ======   ========   ========
0   REG0            REG0=RIP SUPPRESSED   W         Q   64  8   1   64      INT         IP
1   REG1            REG1=RSP SUPPRESSED   W   INVALID   64  8   1   64      INT     GPR
2   REG2        REG2=RFLAGS SUPPRESSED   W      Y   32  4   1   32      INT     FLAGS
Memory Operands
  MemopBytes = 0
FLAGS:
  must-write-rflags vm-0 rf-0 if-0
    read:                               mask=0x0
    written:                    if vm rf  mask=0x30200
  undefined:                                mask=0x0
ATTRIBUTES: NOTSX PROTECTED_MODE
ISA SET: [PPRO]

$ ./obj/examples/xed-ex1 -64 8E00
Attempting to decode: 8e 00
iclass MOV    category DATAXFER    ISA-extension BASE    ISA-set I86
instruction-length 2
operand-width 32
effective-operand-width 32
effective-address-width 64
stack-address-width 64
iform-enum-name MOV_SEG_MEMw
iform-enum-name-dispatch (zero based) 21
iclass-max-iform-dispatch 22
Operands
#   TYPE            DETAILS     VIS  RW     OC2 BITS BYTES NELEM ELEMSZ   ELEMTYPE   REGCLASS
#   ====            =======     ===  ==     === ==== ===== ===== ======   ========   ========
0   REG0            REG0=ES   EXPLICIT   W   INVALID   16   2   1   16      INT         SR
1   MEM0        (see below)   EXPLICIT   R      W   16  2   1   16      INT INVALID
Memory Operands
  0 read BASE= RAX/GPR  ASZ0=64
  MemopBytes = 2
ATTRIBUTES: NOTSX
ISA SET: [I86]

Thank you!

markcharney commented 6 years ago

1) xed3_operand_get_rex() will tell you if there is a REX prefix (regardless of any of the 4 payload bits)

2) My thinking was that for things that resolve to registers one could just look at the register width for the specific register so I didn't need the oc2 code (which is why it shows up as invalid). xed_decoded_inst_operand_length_bits() does that and returns the length in bits.
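For readers unfamiliar with what a "REX prefix with no bits set" looks like at the byte level, here is a small illustration. It does not use XED at all (with XED, the calls named above are the way to go); it is a naive scanner that assumes 64-bit mode and a legacy-encoded instruction (no VEX/EVEX), and the function names are made up for this sketch.

```python
# Legacy prefixes: LOCK, REPNE, REP, segment overrides, operand/address size.
LEGACY_PREFIXES = frozenset(
    [0xF0, 0xF2, 0xF3, 0x2E, 0x36, 0x3E, 0x26, 0x64, 0x65, 0x66, 0x67])


def find_rex(insn):
    """Return the REX byte of a 64-bit-mode instruction, or None if absent.

    A REX prefix is any byte 0x40..0x4F sitting directly before the opcode,
    i.e. after all legacy prefixes.
    """
    i = 0
    while i < len(insn) and insn[i] in LEGACY_PREFIXES:
        i += 1
    if i < len(insn) and 0x40 <= insn[i] <= 0x4F:
        return insn[i]
    return None


def rex_bits(rex):
    """Split the low nibble of a REX byte into its W, R, X, B payload bits."""
    return {"W": (rex >> 3) & 1, "R": (rex >> 2) & 1,
            "X": (rex >> 1) & 1, "B": rex & 1}


rex = find_rex(bytes.fromhex("408b03"))   # REX=0x40, then mov eax, [rbx]
print(hex(rex), rex_bits(rex))
```

A REX byte of 0x40 is present but carries all-zero payload bits, which is the case the first question asks about; 0x48 would set only W.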

pgoodman commented 6 years ago

@recvfrom If you could generate the instructions, then you might be able to reasonably test them with microx :-D

recvfrom commented 6 years ago

@pgoodman thanks for the suggestion, I'll take a look at microx!

@markcharney I had a question about some of the patterns that leverage the IMMUNE66() tag. Specifically, the following: (the format is "%s %s" % (iclass, pattern))

ADCX 0x0F 0x38 0xF6   MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM() REP=0 OSZ=1 REXW=0 SKIP_OSZ=1  IMMUNE66()
ADCX 0x0F 0x38 0xF6  MOD[0b11] MOD=3 REG[rrr] RM[nnn] REP=0 OSZ=1  REXW=1 SKIP_OSZ=1 IMMUNE66()
ADCX 0x0F 0x38 0xF6  MOD[0b11] MOD=3 REG[rrr] RM[nnn] REP=0 OSZ=1 REXW=0 SKIP_OSZ=1  IMMUNE66()
ADCX 0x0F 0x38 0xF6  MOD[mm] MOD!=3 REG[rrr] RM[nnn]  MODRM() REP=0 OSZ=1  REXW=1 SKIP_OSZ=1  IMMUNE66()
PCMPESTRI 0x0F 0x3A 0x61 REP=0 OSZ=1 IMMUNE66() REXW=0 SKIP_OSZ=1 MOD[0b11] MOD=3 REG[rrr] RM[nnn] UIMM8()
PCMPESTRI 0x0F 0x3A 0x61 REP=0 OSZ=1 IMMUNE66() REXW=0 SKIP_OSZ=1 MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM() UIMM8()
PCMPESTRI 0x0F 0x3A 0x61 REP=0 OSZ=1 IMMUNE66() REXW=1 SKIP_OSZ=1 MOD[0b11] MOD=3 REG[rrr] RM[nnn] UIMM8()
PCMPESTRI 0x0F 0x3A 0x61 REP=0 OSZ=1 IMMUNE66() REXW=1 SKIP_OSZ=1 MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM() UIMM8()
PCMPESTRM 0x0F 0x3A 0x60 REP=0 OSZ=1 IMMUNE66() REXW=0 SKIP_OSZ=1 MOD[0b11] MOD=3 REG[rrr] RM[nnn] UIMM8()
PCMPESTRM 0x0F 0x3A 0x60 REP=0 OSZ=1 IMMUNE66() REXW=0 SKIP_OSZ=1 MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM() UIMM8()
PCMPESTRM 0x0F 0x3A 0x60 REP=0 OSZ=1 IMMUNE66() REXW=1 SKIP_OSZ=1 MOD[0b11] MOD=3 REG[rrr] RM[nnn] UIMM8()
PCMPESTRM 0x0F 0x3A 0x60 REP=0 OSZ=1 IMMUNE66() REXW=1 SKIP_OSZ=1 MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM() UIMM8()
PCMPISTRI 0x0F 0x3A 0x63 REP=0 OSZ=1 IMMUNE66() REXW=0 SKIP_OSZ=1 MOD[0b11] MOD=3 REG[rrr] RM[nnn] UIMM8()
PCMPISTRI 0x0F 0x3A 0x63 REP=0 OSZ=1 IMMUNE66() REXW=0 SKIP_OSZ=1 MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM() UIMM8()
PCMPISTRI 0x0F 0x3A 0x63 REP=0 OSZ=1 IMMUNE66() REXW=1 SKIP_OSZ=1 MOD[0b11] MOD=3 REG[rrr] RM[nnn] UIMM8()
PCMPISTRI 0x0F 0x3A 0x63 REP=0 OSZ=1 IMMUNE66() REXW=1 SKIP_OSZ=1 MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM() UIMM8()
PCMPISTRM 0x0F 0x3A 0x62 REP=0 OSZ=1 IMMUNE66() MOD[0b11] MOD=3 REG[rrr] RM[nnn] UIMM8()
PCMPISTRM 0x0F 0x3A 0x62 REP=0 OSZ=1 IMMUNE66() MOD[mm] MOD!=3 REG[rrr] RM[nnn] MODRM() UIMM8()

From what I can gather, IMMUNE66() indicates that no 0x66 prefix should be generated for these instructions in 32b mode... From the Sept. 2016 Intel manual, though, it seems like these instructions have 0x66 built-in to their op codes, so REFINING66() should be used instead. Am I misunderstanding the usage of IMMUNE66() here? Thanks!

markcharney commented 6 years ago

IMMUNE66() is really for the effective operand size calculation. Some instructions with 66 prefixes don't respond in the conventional way to the 66 prefix. This is very true for SIMD instructions where the EOSZ value is less relevant. But for instructions like these that use GPRs, IMMUNE66() is how I factor the 66 prefix out of the normal EOSZ computation.

P.S. Always use the latest SDM; updates come out roughly quarterly.

markcharney commented 6 years ago

and...

REFINING66 is for things that have a 16b EOSZ in 16b protected mode. These don't. I think that was part of your question.

recvfrom commented 6 years ago

Oh ok, that makes sense... Is the correct way to know whether an instruction requires the 0x66 prefix to actually look for OSZ=1, then (in addition to looking for instructions with EOSZ=1 when outputting a program that will run in 32-bit mode)?

markcharney commented 6 years ago

Generally yes, OSZ=1 means a 66 is required. Usually the OSZ=1 is buried in one of the macros ("osz_refining_prefix").

EOSZ=1 is for 16b operations. Not sure why you mentioned that.
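As a sketch of the OSZ=1 check, the snippet below tokenizes one of the "iclass pattern" lines quoted earlier in the thread and reports whether it carries OSZ=1. It assumes that exact dump format with macros already expanded; it is not part of XED or read_xed_db.py, and parse_pattern and needs_66 are hypothetical names.

```python
import re

SAMPLE = ("ADCX 0x0F 0x38 0xF6  MOD[0b11] MOD=3 REG[rrr] RM[nnn] "
          "REP=0 OSZ=1 REXW=0 SKIP_OSZ=1  IMMUNE66()")

HEX_BYTE = re.compile(r"0x[0-9A-Fa-f]{2}")
CONSTRAINT = re.compile(r"([A-Z0-9_]+)(!=|=)(\w+)")


def parse_pattern(line):
    """Split an 'ICLASS pattern...' line into opcode bytes, constraints, rest."""
    iclass, *tokens = line.split()
    opcode, constraints, others = [], {}, []
    for tok in tokens:
        m = CONSTRAINT.fullmatch(tok)
        if HEX_BYTE.fullmatch(tok):
            opcode.append(int(tok, 16))
        elif m:
            # e.g. "OSZ=1" -> {"OSZ": ("=", "1")}, "MOD!=3" -> {"MOD": ("!=", "3")}
            constraints[m.group(1)] = (m.group(2), m.group(3))
        else:
            # Field captures like REG[rrr] and nonterminals like IMMUNE66().
            others.append(tok)
    return {"iclass": iclass, "opcode": opcode,
            "constraints": constraints, "others": others}


def needs_66(parsed):
    """True if the pattern demands OSZ=1, i.e. a 0x66 prefix must be emitted."""
    return parsed["constraints"].get("OSZ") == ("=", "1")


p = parse_pattern(SAMPLE)
print(p["iclass"], [hex(b) for b in p["opcode"]], "needs 0x66:", needs_66(p))
```

For the ADCX line this reports that a 0x66 is needed, which matches the SDM encoding 66 0F 38 F6 /r. Patterns read from the un-expanded source files would need the macros resolved first.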

recvfrom commented 6 years ago

Oh, I thought if an instruction had EOSZ=1 in the pattern it was an instruction for which 0x66 could be added to leverage 16b operands in 32b mode. That assumption seems to have worked out fine, although there aren't that many patterns with the tag. Is there a better way to know whether an instruction can optionally have the 0x66 prefix and it actually has an effect on the instruction?

andreas-abel commented 5 years ago

@recvfrom It might not be exactly what you are looking for, but you might also find my project https://github.com/andreas-abel/XED-to-XML helpful.

intelxed / xed

[Question] Generating an executable that contains every valid x86 instruction #134