iximeow / yaxpeax-x86

x86 decoders for the yaxpeax project
BSD Zero Clause License

Commonize x86 `Opcode` and `Operand` downwards across the three processor modes #21

Open DrChat opened 2 years ago

DrChat commented 2 years ago

This is a follow-up to #19.

These enums and structures are mostly identical across all three processor modes, and it is useful to combine them when writing code that is generic over all three modes. To expose the common fields, this adds a new trait, X86Instruction (open to naming suggestions).

The trait is kind of janky to use as of now: you must declare the bound with a where clause:

    where
        <A as Arch>::Instruction: X86Instruction,
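
A minimal sketch of how that bound might look on a generic function. The `Arch` trait below is a stand-in for `yaxpeax_arch::Arch`, and the `opcode_name` accessor, `LongMode`, and `LongModeInstruction` are hypothetical names used only for illustration; the methods of the real X86Instruction trait are not shown in this thread.

```rust
// Stand-in for `yaxpeax_arch::Arch`; the real trait has more associated types.
pub trait Arch {
    type Instruction;
}

// Stand-in for the proposed trait. `opcode_name` is a hypothetical accessor.
pub trait X86Instruction {
    fn opcode_name(&self) -> &'static str;
}

// A toy "architecture" and instruction type, for illustration only.
pub struct LongMode;
pub struct LongModeInstruction;

impl Arch for LongMode {
    type Instruction = LongModeInstruction;
}

impl X86Instruction for LongModeInstruction {
    fn opcode_name(&self) -> &'static str {
        "mov"
    }
}

// The bound has to be written against the associated type, because `Arch`
// itself does not promise that its instructions implement `X86Instruction`.
pub fn describe<A: Arch>(inst: &A::Instruction) -> String
where
    <A as Arch>::Instruction: X86Instruction,
{
    format!("opcode: {}", inst.opcode_name())
}
```

Calling `describe::<LongMode>(&LongModeInstruction)` then works for any architecture whose instruction type implements the trait, at the cost of repeating the where clause on every generic function.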

iximeow commented 2 years ago

hmm, i was thinking of a bit of a different approach: as you've noted, Operand and Opcode generally are very similar, so i think we could actually replace explicitly writing out the Opcode enum with a bit of codegen that also handles laying out the string arrays over in */display.rs. that, i think, would also let us assign explicit values to the Opcode variants so yaxpeax-x86 could have data structures laid out like...

const MNEMONICS: &'static [&'static str] = &[
    "add",
    "sub",
    "aaa", // only 16- and 32-bit `Opcode` reference this
    "aas", // only 16- and 32-bit `Opcode` reference this
    "movsx", // only 64-bit `Opcode` references this
    "mov", // used in all modes
];

mod real_mode {
    enum Opcode {
        ADD = 0,
        SUB = 1,
        AAA = 2,
        AAS = 3,
        MOV = 5,
        ...
    }
}

mod long_mode {
    enum Opcode {
        ADD = 0,
        SUB = 1,
        MOVSX = 4,
        MOV = 5,
        ...
    }
}

mod quasi_x86_name_pending {
    // note that _this_ `Opcode` has the same integer values for each variant, so a conversion to this opcode can be just a transmute
    enum Opcode {
        ADD = 0,
        SUB = 1,
        AAA = 2,
        AAS = 3,
        MOVSX = 4,
        MOV = 5,
        ...
    }
}

where this could get generated from a table like

add=all,
sub=all,
aaa=16,32
aas=16,32
movsx=64
mov=all,
...

spitballing, i really haven't thought about the table layout in particular. this could let us generate the Colorize impl too, which is just kinda gross to maintain.
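
As a rough sketch of the generation step, assuming the `name=modes` table format above (which is itself only a suggestion), a generator could use each row's index as the discriminant in every emitted enum. `Modes`, `parse_table`, and `emit_enum` are hypothetical names, not part of yaxpeax-x86.

```rust
/// Which processor modes an opcode is valid in (hypothetical representation).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Modes {
    pub real: bool,      // 16-bit
    pub protected: bool, // 32-bit
    pub long: bool,      // 64-bit
}

/// Parse rows like `aaa=16,32` or `add=all,` into (mnemonic, modes) pairs.
/// Rows without `=` or with an unknown mode are skipped in this sketch.
pub fn parse_table(table: &str) -> Vec<(String, Modes)> {
    table
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .filter_map(|line| {
            let (name, modes) = line.split_once('=')?;
            let mut parsed = Modes { real: false, protected: false, long: false };
            for mode in modes.split(',').map(str::trim).filter(|m| !m.is_empty()) {
                match mode {
                    "all" => parsed = Modes { real: true, protected: true, long: true },
                    "16" => parsed.real = true,
                    "32" => parsed.protected = true,
                    "64" => parsed.long = true,
                    _ => return None,
                }
            }
            Some((name.trim().to_string(), parsed))
        })
        .collect()
}

/// Emit the source text of one `Opcode` enum. Only rows selected by `keep`
/// appear, but the discriminant is always the row's index in the full table,
/// so the same mnemonic gets the same value in every generated enum.
pub fn emit_enum(rows: &[(String, Modes)], keep: impl Fn(&Modes) -> bool) -> String {
    let mut out = String::from("#[repr(u16)]\npub enum Opcode {\n");
    for (index, (name, modes)) in rows.iter().enumerate() {
        if keep(modes) {
            out.push_str(&format!("    {} = {},\n", name.to_uppercase(), index));
        }
    }
    out.push_str("}\n");
    out
}
```

With the six rows from the table, `emit_enum(&rows, |m| m.long)` yields an enum containing `ADD = 0`, `SUB = 1`, `MOVSX = 4`, and `MOV = 5`, and `emit_enum(&rows, |_| true)` yields the superset enum. The same row order could also drive the `MNEMONICS` array.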

this is trickier for Operand since i don't think we can guarantee layout-compatibility for those. but with something automated handling Opcode and a bit of elbow grease around Operand, i think then we could have a

mod quasi_x86_name_pending {
    /// an "arch" for a pseudo-x86 - a best-effort superset of 16-, 32-, and 64-bit x86
    pub struct Arch;

    impl yaxpeax_arch::Arch for Arch {
        // same idea as other modes, but with the superset versions of `Opcode` and `Operand`
    }

    struct SupersetDecoderNamePending {
        x86_16: yaxpeax_x86::real_mode::InstDecoder,
        x86_32: yaxpeax_x86::protected_mode::InstDecoder,
        x86_64: yaxpeax_x86::long_mode::InstDecoder,
        current_mode: EnumToSelectWhichDecoder
    }

    impl Decoder<Arch> for SupersetDecoderNamePending {
        fn decode<...>(&self, words: ...) -> Result<Instruction, DecodeError> {
            match self.current_mode {
                x86_16 => self.x86_16.decode(words).map(|inst| inst.into_superset_form())
                ...
            }
        }
    }
}

this would require functions to transform an arch-specific instruction into the common-x86 form, but that fills almost the same niche as your X86Instruction trait (though a bit differently). i think it fits the same for uses like yours, but avoids kinda-shared kinda-distinct data for uses where distinguishing between modes is more desired?
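
To make the dispatch idea concrete, here is a self-contained toy version. Everything in it is a stand-in: the per-mode decoders only recognize one or two single-byte opcodes, only two modes are modeled, and `SupersetDecoder`, `Mode`, and the `From` conversions are hypothetical names rather than anything in yaxpeax-x86.

```rust
#[derive(Debug, PartialEq)]
pub struct DecodeError;

pub mod real_mode {
    use super::DecodeError;

    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Opcode { AAA, NOP }

    pub struct InstDecoder;

    impl InstDecoder {
        // Toy decoder: recognizes only 0x37 (`aaa`) and 0x90 (`nop`).
        pub fn decode(&self, byte: u8) -> Result<Opcode, DecodeError> {
            match byte {
                0x37 => Ok(Opcode::AAA),
                0x90 => Ok(Opcode::NOP),
                _ => Err(DecodeError),
            }
        }
    }
}

pub mod long_mode {
    use super::DecodeError;

    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Opcode { NOP }

    pub struct InstDecoder;

    impl InstDecoder {
        // Toy decoder: `aaa` (0x37) is not a valid instruction in 64-bit mode.
        pub fn decode(&self, byte: u8) -> Result<Opcode, DecodeError> {
            match byte {
                0x90 => Ok(Opcode::NOP),
                _ => Err(DecodeError),
            }
        }
    }
}

pub mod generic {
    /// Superset of the per-mode opcodes.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Opcode { AAA, NOP }

    impl From<super::real_mode::Opcode> for Opcode {
        fn from(op: super::real_mode::Opcode) -> Self {
            match op {
                super::real_mode::Opcode::AAA => Opcode::AAA,
                super::real_mode::Opcode::NOP => Opcode::NOP,
            }
        }
    }

    impl From<super::long_mode::Opcode> for Opcode {
        fn from(op: super::long_mode::Opcode) -> Self {
            match op {
                super::long_mode::Opcode::NOP => Opcode::NOP,
            }
        }
    }
}

pub enum Mode { Real, Long }

/// Holds one decoder per mode and converts every result to the generic form.
pub struct SupersetDecoder {
    x86_16: real_mode::InstDecoder,
    x86_64: long_mode::InstDecoder,
    current_mode: Mode,
}

impl SupersetDecoder {
    pub fn new(current_mode: Mode) -> Self {
        SupersetDecoder {
            x86_16: real_mode::InstDecoder,
            x86_64: long_mode::InstDecoder,
            current_mode,
        }
    }

    pub fn decode(&self, byte: u8) -> Result<generic::Opcode, DecodeError> {
        match self.current_mode {
            Mode::Real => self.x86_16.decode(byte).map(Into::into),
            Mode::Long => self.x86_64.decode(byte).map(Into::into),
        }
    }
}
```

The mode-specific decoders stay untouched and keep their precise types; only callers that want a mode-agnostic view pay for the conversion step.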

i509VCB commented 2 years ago

If we are going to do some codegen, I'd highly recommend using a standard format like json (if possible) so others can use that data as well.

DrChat commented 2 years ago

Hmm - such a table that links Opcode to the MNEMONICS array could likely be generated with a procedural macro. It'd be somewhat difficult to generate the mode-specific tables without resorting to a custom solution (such as a build.rs script or a proc-macro crate).

I do like the idea of having all Opcode enums be represented by an integral value that is the same for every unique instruction.

Let me think some more and see if I can't work my way towards what you've suggested.

DrChat commented 2 years ago

Just to write this idea down to save for later - we could probably implement Opcode subsets with a special proc-macro:

#[repr(usize)] // Required to define the layout
enum Opcode {
  ADD,
  AAA,
  AAS,
  SUB,
  MOV,
  MOVSX,
}

mod long_mode {
  #[superset="super::Opcode"]
  #[repr(usize)] // Required to define the layout
  enum Opcode {
    ADD,
    SUB,
    MOV,
    MOVSX,
    INC, // ERROR: Enum variant is not specified in superset enum (as an example)
  }

  // Generated by proc-macro
  impl Opcode {
    pub fn to_superset(self) -> super::Opcode {
      // SAFETY: sound only if the macro gives every subset variant the same
      // discriminant as its superset counterpart (both enums are `repr(usize)`).
      unsafe { core::mem::transmute(self) }
    }

    pub fn from_superset(opcode: super::Opcode) -> Option<Self> {
      todo!()
    }
  }
}

Such a macro would define subset variants to be equivalent to their superset variants (for 1:1 conversion or direct casting in the case of going from a subset to a superset).

iximeow commented 2 years ago

ah! i was wondering if you'd made progress on this or put it aside. is there already a proc macro for superset or would you have to write that too? the tricky thing here is that if there are holes in the subset enums you'd need to either pick matching underlying values for all variants or make to_superset a bit more complicated (otherwise the transmute might map f.ex long_mode::Opcode::INC to ::Opcode::MOV! no good)
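
The hole problem can be seen without any unsafe code. The sketch below reuses the variant lists from the earlier comment (minus `INC`) with implicit discriminants, and shows the "more complicated" but always-correct alternative: an explicit variant-by-variant mapping. The `to_superset` body here is an illustration, not the output of any existing macro.

```rust
#[repr(usize)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Opcode {
    ADD,   // 0
    AAA,   // 1
    AAS,   // 2
    SUB,   // 3
    MOV,   // 4
    MOVSX, // 5
}

pub mod long_mode {
    #[repr(usize)]
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Opcode {
        ADD,   // 0
        SUB,   // 1: collides with the superset's AAA, not SUB
        MOV,   // 2
        MOVSX, // 3
    }

    impl Opcode {
        /// Explicit mapping: correct regardless of how the two enums are
        /// numbered, but it has to list every variant.
        pub fn to_superset(self) -> super::Opcode {
            match self {
                Opcode::ADD => super::Opcode::ADD,
                Opcode::SUB => super::Opcode::SUB,
                Opcode::MOV => super::Opcode::MOV,
                Opcode::MOVSX => super::Opcode::MOVSX,
            }
        }
    }
}
```

Here `long_mode::Opcode::SUB as usize` is 1, the same value as the superset's `AAA`, so reinterpreting the bits would silently produce the wrong opcode.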

my thought was to list out the whole deal in a table (json like @i509VCB mentioned would make sense) and generate off of that, with the light benefit that we wouldn't have ~6k lines of enum variants anymore :sunglasses:

anyway, if you're planning on putting this down, i might give that idea a try in the next few weeks.

DrChat commented 2 years ago

I made more progress - but in the interest of expediency, I've only made progress that directly impacts my project (changes here). And you raise a good point - my thought was to make the subset enums declare values that are equivalent to their superset variants, i.e.

#[superset="super::Opcode"]
#[repr(usize)]
enum Opcode {
  ADD = super::Opcode::ADD as usize,
  // ...
}

Done implicitly by the macro, of course. I may put some time towards it, but definitely feel free to give it a shot if you're feeling it as well!
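
Written out by hand, what such a macro might expand to could look roughly like this. It is a sketch under the assumption that both enums are fieldless and `repr(usize)`; no `superset` proc macro is known to exist, so the expansion shown is hypothetical.

```rust
#[repr(usize)]
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Opcode {
    ADD,
    AAA,
    AAS,
    SUB,
    MOV,
    MOVSX,
}

pub mod long_mode {
    #[repr(usize)]
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Opcode {
        // Each discriminant is copied from the matching superset variant.
        ADD = super::Opcode::ADD as usize,
        SUB = super::Opcode::SUB as usize,
        MOV = super::Opcode::MOV as usize,
        MOVSX = super::Opcode::MOVSX as usize,
    }

    impl Opcode {
        pub fn to_superset(self) -> super::Opcode {
            // SAFETY: both enums are `repr(usize)`, and every discriminant of
            // this enum was taken from a `super::Opcode` variant, so the bit
            // pattern is always a valid `super::Opcode`.
            unsafe { core::mem::transmute::<Opcode, super::Opcode>(self) }
        }

        /// The reverse direction must be checked, since the superset has
        /// variants that do not exist in this mode.
        pub fn from_superset(opcode: super::Opcode) -> Option<Self> {
            match opcode {
                super::Opcode::ADD => Some(Opcode::ADD),
                super::Opcode::SUB => Some(Opcode::SUB),
                super::Opcode::MOV => Some(Opcode::MOV),
                super::Opcode::MOVSX => Some(Opcode::MOVSX),
                super::Opcode::AAA | super::Opcode::AAS => None,
            }
        }
    }
}
```

With matching discriminants, the subset-to-superset direction is a plain reinterpretation, while `from_superset` stays a checked conversion that returns `None` for opcodes the mode does not have.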

iximeow commented 2 years ago

in case you're still watching this, i did finally give this a shot - https://github.com/iximeow/yaxpeax-x86/commit/354df90573693ca70de72705b6a77b4e02b53f01 is the current (still not a full change set) approach. this adds a new x86_generic where the specific modes can be converted up to the generic one. then for Opcode, the most verbose of all this stuff, Display, Colorize, mnemonics, etc, are implemented in terms of the generic Opcode. it also comes with (currently architecture-specific, not sure it has to be) codegen for the "decoding as if this is a specific microarchitecture" feature. that being duplicated for each mode is a pretty substantial portion of the code that's in that diff.

(then there's a fair question of "why generate it with python instead of a proc macro or build.rs?", and the answer is a moral opposition to build-time codegen if it's not necessary. debugging a proc macro is really annoying and i don't like asking people to run build.rs scripts. so, generate when it's updated and commit it. very gopher brain of me. sorry to the rustaceans.)

there's a bit more on top of this commit that i've yet to get to a point i want to push, but i'm convinced that this gets us to a point where yaxpeax-x86 has a useful generic "just try your best" mode without being too much overhead.

i also have a sneaking suspicion that even with the extra source lines, this might reduce the total resulting size of the compiled crate with more than one architecture included. with Opcode unified as it is, there might even be a good chance to unify some of the decode tables.
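
For readers who have not opened the commit, the general shape of "Display implemented in terms of the generic Opcode" might look like the sketch below. It is not taken from the linked commit: the module name `generic`, the six-entry table, and the `From` conversion are assumptions that reuse the example values from earlier in the thread.

```rust
// One shared mnemonic table, indexed by the generic discriminant.
const MNEMONICS: &[&str] = &["add", "sub", "aaa", "aas", "movsx", "mov"];

pub mod generic {
    use core::fmt;

    #[repr(u16)]
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Opcode {
        ADD = 0,
        SUB = 1,
        AAA = 2,
        AAS = 3,
        MOVSX = 4,
        MOV = 5,
    }

    impl fmt::Display for Opcode {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            // The discriminant doubles as the index into the shared table.
            f.write_str(super::MNEMONICS[*self as usize])
        }
    }
}

pub mod long_mode {
    use super::generic;
    use core::fmt;

    #[repr(u16)]
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Opcode {
        ADD = 0,
        SUB = 1,
        MOVSX = 4,
        MOV = 5,
    }

    impl From<Opcode> for generic::Opcode {
        fn from(op: Opcode) -> Self {
            match op {
                Opcode::ADD => generic::Opcode::ADD,
                Opcode::SUB => generic::Opcode::SUB,
                Opcode::MOVSX => generic::Opcode::MOVSX,
                Opcode::MOV => generic::Opcode::MOV,
            }
        }
    }

    // The mode-specific type carries no formatting logic of its own.
    impl fmt::Display for Opcode {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            fmt::Display::fmt(&generic::Opcode::from(*self), f)
        }
    }
}
```

Because only the generic enum owns the string table and formatting code, adding another mode costs one conversion impl rather than another copy of the mnemonic list.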