vusec / inspectre-gadget

InSpectre Gadget

IR decoding error for WRMSR instructions #7

Open andyhhp opened 7 months ago

andyhhp commented 7 months ago

When trying this scanner on Xen, I got 4 instances of the following:

---------------- [ SCANNER ERROR ] ----------------
where: 0xffff82d040201fca     started at:0xffff82d040201c90
IR decoding error at 0xffff82d040201fca. You can hook this instruction with a python replacement using project.hook(0xffff82d040201fca, your_function, length=length_of_instruction).
....
angr.errors.SimIRSBNoDecodeError: IR decoding error at 0xffff82d040201fca. You can hook this instruction with a python replacement using project.hook(0xffff82d040201fca, your_function, length=length_of_instruction).

All reported "where" addresses are WRMSR instructions. Full binary and repro details are at https://github.com/vusec/inspectre-gadget/issues/6#issue-2050877801
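The error message itself suggests hooking the failing instruction with project.hook(addr, your_function, length=...). A minimal sketch of what that could look like for the WRMSR sites is below. It assumes that skipping the instruction is acceptable for the analysis, which ignores the serialising behaviour of WRMSR, and the helper names (skip_instruction, hook_wrmsr_sites) are made up for illustration; they are not part of angr or of the scanner.

```python
# WRMSR encodes as the two bytes 0f 30 (see the objdump output in this thread).
WRMSR_LEN = 2


def skip_instruction(state):
    """Hook body that does nothing.

    Assumption: with a non-zero hook length, execution resumes after the
    hooked bytes, so the WRMSR is effectively treated as a no-op.
    """
    return None


def hook_wrmsr_sites(project, addrs):
    """Install the no-op hook at every address in addrs.

    `project` is expected to be an angr.Project (or anything with a
    compatible hook(addr, hook, length=...) method).
    """
    count = 0
    for addr in addrs:
        project.hook(addr, skip_instruction, length=WRMSR_LEN)
        count += 1
    return count
```

With a real project this would be called as hook_wrmsr_sites(project, [0xffff82d040201fca, ...]) before running the scanner. Whether the scanner should instead stop at these sites is a separate question.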

andyhhp commented 7 months ago

After increasing the basic block count, I've found more instructions that are unrecognised:

$ awk '$1 == "IR" { printf("^%s:\n", substr($5, 3, 16)) }' < fail.txt | sort -u | grep -f- <(objdump -d /local/xen.git/xen/xen-syms)
ffff82d040201e98:   0f 30                   wrmsr  
ffff82d040201fca:   0f 30                   wrmsr  
ffff82d040206a47:   0f 0b                   ud2    
ffff82d040206d31:   0f 0b                   ud2    
ffff82d04044d9f0:   0f 78 c0                vmread %rax,%rax
ffff82d04059c32c:   0f 0b                   ud2    
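For anyone who prefers not to use the awk/grep one-liner, roughly the same extraction can be sketched in Python. This assumes the scanner log contains lines starting with "IR decoding error at 0x..." as shown above, and that objdump prints each instruction line starting with the bare address followed by a colon. The function names are hypothetical.

```python
import re

# Matches only lines that start with the error text, like awk's $1 == "IR";
# the "angr.errors..." traceback line is deliberately not matched.
IR_ERROR = re.compile(r"^IR decoding error at 0x([0-9a-fA-F]{16})")


def failing_addresses(log_text):
    """Return the sorted, de-duplicated failing addresses (hex, no 0x)."""
    addrs = set()
    for line in log_text.splitlines():
        match = IR_ERROR.match(line.strip())
        if match:
            addrs.add(match.group(1).lower())
    return sorted(addrs)


def disassembly_for(addrs, objdump_text):
    """Return the objdump lines whose leading address is in addrs."""
    wanted = set(addrs)
    hits = []
    for line in objdump_text.splitlines():
        head, sep, _ = line.partition(":")
        # Equivalent of grep '^<addr>:': the line must start with the address.
        if sep and head in wanted:
            hits.append(line)
    return hits
```

Feeding it the contents of fail.txt and the output of objdump -d should give the same kind of listing as above.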

Speculation-wise, most wrmsr instructions are architecturally serialising, and it's probably safe to treat them all as having this property. ud2, like int3, is guaranteed to halt speculation. vmread is part of the VT-x instruction set and can probably be ignored for now.

andyhhp commented 7 months ago

With the fix for #6 in place, analysis gets further, and this is the new list:

ffff82d040201da2:   0f 30                   wrmsr  
ffff82d040201e02:   0f 30                   wrmsr  
ffff82d040201ed4:   0f 30                   wrmsr  
ffff82d040201f34:   0f 30                   wrmsr  
ffff82d040246610:   0f 0b                   ud2    
ffff82d0402c27a7:   0f 78 c0                vmread %rax,%rax
ffff82d04031ff98:   f3 48 0f ae c8          rdgsbase %rax
ffff82d04032784e:   45 0f 03 e4             lsl    %r12w,%r12d
ffff82d040331567:   0f 0b                   ud2    
ffff82d04034294e:   0f 30                   wrmsr  

I think it's safe to say that angr hasn't encountered much kernel/hypervisor code thus far.

SanWieb commented 7 months ago

I would suggest separating the IR decoding errors from the other errors and writing the locations to a separate file (e.g., unsupported.txt).

There is no need to support the instructions that stop speculation, since it's fine to stop the analysis at that point. For the other instructions, we may indeed have to look at whether we can add support via valgrind (or maybe pyvex). So let's keep track here of which instructions we want support for.
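A rough sketch of this triage, to make the proposal concrete: it splits objdump lines into instructions that stop speculation anyway and instructions that would need decoder support, and writes the latter to unsupported.txt. The set of speculation-stopping mnemonics is taken from the comments earlier in this thread (wrmsr, ud2, int3) and is an assumption, not a verified list. The function names are hypothetical and this is not how the scanner currently works.

```python
from pathlib import Path

# Assumption, per the discussion above: analysis can simply stop at these.
STOPS_SPECULATION = {"wrmsr", "ud2", "int3"}

HEX_DIGITS = set("0123456789abcdef")


def mnemonic_of(objdump_line):
    """Return the mnemonic of one objdump line, or None if there is none."""
    _, _, rest = objdump_line.partition(":")
    for token in rest.split():
        # Skip the raw encoding bytes such as "0f" or "30".
        is_byte = len(token) == 2 and set(token) <= HEX_DIGITS
        if not is_byte:
            return token
    return None


def triage(objdump_lines):
    """Split lines into (stops_speculation, needs_support)."""
    stops, needs_support = [], []
    for line in objdump_lines:
        mnemonic = mnemonic_of(line)
        if mnemonic is None:
            continue
        target = stops if mnemonic in STOPS_SPECULATION else needs_support
        target.append(line.strip())
    return stops, needs_support


def write_unsupported(lines, path="unsupported.txt"):
    """Write the instructions that still need decoder support, one per line."""
    Path(path).write_text("".join(line + "\n" for line in lines))
```

On the latest list above, this would leave vmread, rdgsbase and lsl in unsupported.txt.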

Do you agree?

andyhhp commented 7 months ago

Logging them separately is probably a good thing. I've been pointed at https://github.com/angr/vex/commit/e8a55899b890c91d3f243fb98b680afbcde3ee71 as an example of adding support to pyvex, but I have to say that the x86 decoder semantics leave a lot to be desired.