iximeow / yaxpeax-x86

x86 decoders for the yaxpeax project
BSD Zero Clause License

Incorrect "invalid operand" decoding errors #3

Closed ranweiler closed 4 years ago

ranweiler commented 4 years ago

While single-stepping a simple x86-64 ELF binary on Linux, I ran into (probably) spurious decoding errors with the following byte vectors (presented here with their XED-decoded short string):

"invalid operand":

0fc7642440:  xsavec ptr [esp+0x40]
660febc3:    por xmm0, xmm3
660febc4:    por xmm0, xmm4
660febd3:    por xmm2, xmm3
660f74c1:    pcmpeqb xmm0, xmm1
660ff8c8:    psubb xmm1, xmm0
660ff8d0:    psubb xmm2, xmm0

These all occurred in either ld-2.30.so or libc-2.30.so.
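For readers who want to check the XED strings by hand: every case above except xsavec has the shape `66 0f <opcode> <modrm>`, and the ModRM byte alone names both registers. The following is a minimal, std-only sketch of that split. It is not how yaxpeax-x86 decodes these instructions; `split_modrm`, `parse_hex`, and `xmm_operands` are hypothetical helpers written for this illustration, and the sketch only handles the register-to-register form (mod = 3).

```rust
/// Fields of a ModRM byte: mod (top 2 bits), reg (middle 3), rm (low 3).
#[derive(Debug, PartialEq)]
struct ModRm {
    mode: u8,
    reg: u8,
    rm: u8,
}

/// Hypothetical helper: split a ModRM byte into its three fields.
fn split_modrm(byte: u8) -> ModRm {
    ModRm {
        mode: byte >> 6,
        reg: (byte >> 3) & 0b111,
        rm: byte & 0b111,
    }
}

/// Hypothetical helper: parse a hex string such as "660febc3" into bytes.
fn parse_hex(s: &str) -> Option<Vec<u8>> {
    if s.len() % 2 != 0 || !s.is_ascii() {
        return None;
    }
    (0..s.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&s[i..i + 2], 16).ok())
        .collect()
}

/// For a `66 0f <opcode> <modrm>` encoding in register form, return
/// (opcode, reg, rm). Anything else is rejected with `None`.
fn xmm_operands(bytes: &[u8]) -> Option<(u8, u8, u8)> {
    match bytes {
        [0x66, 0x0f, opcode, modrm] => {
            let m = split_modrm(*modrm);
            if m.mode == 0b11 {
                Some((*opcode, m.reg, m.rm))
            } else {
                None
            }
        }
        _ => None,
    }
}

fn main() {
    let cases = ["660febc3", "660febc4", "660febd3", "660f74c1", "660ff8c8", "660ff8d0"];
    for case in cases {
        match parse_hex(case).as_deref().and_then(xmm_operands) {
            Some((opcode, reg, rm)) => {
                println!("{case}: opcode 0f {opcode:02x}, xmm{reg}, xmm{rm}")
            }
            None => println!("{case}: not a register-form 66 0f encoding"),
        }
    }
}
```

The `reg` field is the first (destination) operand and `rm` the second, which lines up with the register pairs in the XED strings listed above.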

I also ran into this, which is probably expected:

"the decoder is incomplete":

c5f96ec6:    vmovd xmm0, esi

I'm mentioning this last one anyway because it was in libc-2.30.so. It may be worth prioritizing such instructions.
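To show what the `c5 f9` at the front of that encoding carries, here is a rough, std-only sketch that unpacks a two-byte VEX prefix. It assumes 64-bit mode (where a leading `c5` is always VEX) and is only an illustration of the bit layout, not yaxpeax-x86's implementation; the `Vex2` struct and `decode_vex2` function are names made up for this example.

```rust
/// Fields carried by a two-byte VEX prefix (`c5 xx`), with the
/// encoding's bit inversions already undone.
#[derive(Debug, PartialEq)]
struct Vex2 {
    /// Extension bit for ModRM.reg (stored inverted in the encoding).
    r: bool,
    /// Extra source register number (stored inverted in the encoding).
    vvvv: u8,
    /// Vector length: false = 128-bit, true = 256-bit.
    l: bool,
    /// Implied legacy prefix: 0 = none, 1 = 66, 2 = f3, 3 = f2.
    pp: u8,
}

/// Hypothetical helper: decode the payload byte that follows `c5`.
fn decode_vex2(bytes: &[u8]) -> Option<Vex2> {
    match bytes {
        [0xc5, payload, ..] => {
            let p = *payload;
            Some(Vex2 {
                r: (p & 0x80) == 0,
                vvvv: (!p >> 3) & 0x0f,
                l: (p & 0x04) != 0,
                pp: p & 0x03,
            })
        }
        _ => None,
    }
}

fn main() {
    let bytes = [0xc5, 0xf9, 0x6e, 0xc6];
    if let Some(vex) = decode_vex2(&bytes) {
        let opcode = bytes[2];
        let modrm = bytes[3];
        // VEX.R supplies the fourth bit of the ModRM.reg register number.
        let reg = ((vex.r as u8) << 3) | ((modrm >> 3) & 0b111);
        let rm = modrm & 0b111;
        println!("{vex:?}, opcode 0f {opcode:02x}, reg = xmm{reg}, rm = gpr #{rm}");
    }
}
```

For `c5f96ec6` this gives an implied `66` prefix, a 128-bit length, opcode `6e` in the `0f` map, and ModRM fields reg = 0 and rm = 6, which matches `vmovd xmm0, esi`.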

A more verbose, script-generated comparison to XED can be found here.

I am using the Default::default()-provided long_mode::InstDecoder, which IIUC is giving me the best possible chance at decoding.

Versions:

yaxpeax-x86: 0.0.6
rustc: 1.41.0
XED: v10.0-791-gb4109c0

ranweiler commented 4 years ago

Oh, and XED decoding was done just by building XED from the latest source, then invoking the example xed binary against the hex string test cases as xed -d <test-case>.

The yaxpeax repro binary is just this:

```rust
use yaxpeax_arch::Decoder;
use yaxpeax_x86::long_mode as amd64;

type Error = Box<dyn std::error::Error>;

fn main() -> Result<(), Error> {
    let args: Vec<_> = std::env::args().collect();

    // One default-configured 64-bit decoder is enough for every test case.
    let decoder = amd64::InstDecoder::default();

    // Each argument is one instruction as a hex string, e.g. `660febc3`.
    for arg in &args[1..] {
        let raw_str = arg.to_lowercase();
        let raw = hex::decode(arg)?;

        // Successes go to stdout, decode errors to stderr.
        match decoder.decode(raw) {
            Ok(inst) => {
                println!("{}: {}", raw_str, inst);
            },
            Err(e) => {
                eprintln!("{}: {}", raw_str, e);
            },
        }
    }

    Ok(())
}
```

iximeow commented 4 years ago

I figured I'd fix this real quick and have a commit, but that didn't happen, so a few notes on what's going on here. This is two distinct incomplete cases in the decoder:

I'm taking this as a motivator to fill out sse and sse2 all in one go, just because they are part of the x86_64 spec in the first place. avx/avx2 will probably be a bit after that, though everything up to operand decoding should be in place.

iximeow commented 4 years ago

barring one-off misreads of documentation, sse and sse2 are now correctly supported. most avx operand codes are now at least not an error. the vmovd you ran across is specifically tested, but a lot of avx is still :crossed_fingers: - i've seen a few that should read a trailing immediate but.. don't actually read it.

thank you for the report!

(edit: these changes have been published in yaxpeax-x86 0.0.8)

ranweiler commented 4 years ago

@iximeow, should xsavec be implemented in the above fixes? I'm still running into that one on 0.0.11.

iximeow commented 4 years ago

looks like i'd glossed over xsavec in thinking those were all sse-flavor instructions, agh. it is indeed quite missing, as are cmpxchg{8,16}b, xrstors, xsaves, vmptrld, vmptrst, rdseed, and rdrand (all instructions with opcodes of the form `0fc7`)

i'm currently working on adding support for rdseed and rdrand, so these are all right along the way.
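As a rough illustration of why these instructions travel together: they all share the `0f c7` opcode and are told apart by the ModRM byte that follows (the `reg` field, plus whether the operand is memory or a register). The sketch below is std-only and is not taken from yaxpeax-x86; `classify_0fc7` is a hypothetical name. It also ignores prefixes, which in the real encoding select further variants (for example cmpxchg8b versus cmpxchg16b), so treat it as a simplified picture.

```rust
/// Hypothetical helper: name the `0f c7` instruction selected by a ModRM
/// byte, ignoring all prefixes. Returns `None` for combinations this
/// simplified table does not cover.
fn classify_0fc7(modrm: u8) -> Option<&'static str> {
    // mod = 3 means a register operand; anything else is a memory operand.
    let is_mem = (modrm >> 6) != 0b11;
    let reg = (modrm >> 3) & 0b111;
    match (reg, is_mem) {
        (1, true) => Some("cmpxchg8b / cmpxchg16b"),
        (3, true) => Some("xrstors"),
        (4, true) => Some("xsavec"),
        (5, true) => Some("xsaves"),
        (6, true) => Some("vmptrld"),
        (7, true) => Some("vmptrst"),
        (6, false) => Some("rdrand"),
        (7, false) => Some("rdseed"),
        _ => None,
    }
}

fn main() {
    // `0fc7642440` from the original report: the ModRM byte is 0x64.
    let bytes = [0x0f, 0xc7, 0x64, 0x24, 0x40];
    match classify_0fc7(bytes[2]) {
        Some(name) => println!("0f c7 /{}: {}", (bytes[2] >> 3) & 0b111, name),
        None => println!("0f c7: not covered by this sketch"),
    }
}
```

For the reported bytes, the ModRM byte `0x64` has a memory operand and reg = 4, which this table maps to xsavec.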

ranweiler commented 4 years ago

Yay! I'll hang tight then, looking forward to giving those a spin when they land.

iximeow commented 4 years ago

xsavec and friends are actually properly supported for real now, with tests to prove it. this and a related operand decode fix are published as yaxpeax-x86 0.0.12.

ranweiler commented 4 years ago

Fix confirmed with 0.0.12 and up. FYI, I can now successfully step through and disassemble every instruction in a "hello world" process on x86-64 Linux!