qilingframework / qiling

A True Instrumentable Binary Emulation Framework
https://qiling.io
GNU General Public License v2.0

Incorrect memory view after running self-modifying code #561

Closed wonderkun closed 4 years ago

wonderkun commented 4 years ago

Describe the bug

Unicorn appears to have bugs when it emulates self-modifying code (SMC), so Qiling crashes when it runs shellcode that uses SMC.

An example is in this issue: https://github.com/unicorn-engine/unicorn/issues/820.

I found a possible fix, https://github.com/alxchk/unicorn/commit/a195b312141850d9a7fd892625db2bbec242ff7c, but this commit has not been merged into the Unicorn version that Qiling uses, and I don't know whether it actually fixes the problem.

Sample Code

ql = Qiling(["x8664_freebsd/bin/x8664_hello_asm"], "x8664_freebsd", output = "dump")
ql.run()


xwings commented 4 years ago

we need @aquynh to merge it.

@wonderkun since this is a Unicorn issue, can you test with the latest Unicorn (1.0.2) and confirm the issue?

wonderkun commented 4 years ago

First test

Yes, let's first test the case from https://github.com/unicorn-engine/unicorn/issues/820.


python: 3.8.5 unicorn: 1.0.2
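Because the behaviour depends on which Unicorn build is installed, it can help to capture the version programmatically when reporting results. The following is a minimal sketch, assuming the Python binding was installed with pip under the distribution name `unicorn`; the helper name `installed_version` is hypothetical and not part of Unicorn or Qiling.

```python
import sys
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a pip distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Assumption: the binding is distributed on PyPI as "unicorn".
print("python: %d.%d.%d" % sys.version_info[:3])
print("unicorn: %s" % installed_version("unicorn"))
```

If the package is missing, the second line prints `unicorn: None` instead of raising.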

Run the code below:

import unicorn
import unicorn.x86_const as x86

sc = bytes.fromhex(
    "dbd0d97424f45fb8e67741bc31c9b15831471a03471a83c704e2138ba93edb742"
    "a5f52911b5f00d10c6f43b7a004012c32688d43f3c7eb6a047bcfed868603ceb7"
    "48560fffb59a5da8b20872dd8f90f9ad1e901e6520b1b0fd7b1132d1f7182c363"
    "dd3c78cc9e201dd32486cd1c091a8d63ae4c024c6fe16561c8b8cf0d72b69003b"
    "adfa0ef0baa512076fde2f8c8e31a6d6b495e28dd58c4e63eacf30dc4e9bdd09e"
    "3c689a39e8c4954170424cd83bef47a0d38fa50609d5708d1720bc6ef22d2b1f0"
    "1e77ed64a22b4210ffda64e0175064e0e7460ca6d7ad862648a641aff7f0917a8"
    "e3b3eec91f12168c2a6f227b61e9d2c6db1664d5b5bf2bb3b0c8388c3cc0a0ea9"
    "c85ca43187344d08b94352419618ff394ff7d2bb777cd31102425e90423679cca"
    "c0ddb5bb2bb7124244495a4b42c95a4f4acc6ccac08bbe9b284a8a11fae2912c8"
    "b0959d08e283f51a92a2e4e44f31286ebdb2ae8efe4170e5e511b2590ed4cb993"
    "1614311fda3c8b573d46323595f0c85c5fe98bc05"
)

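def main():
    # 32-bit x86 emulator with 0x2000 bytes mapped at 0x1000
    uc = unicorn.Uc(unicorn.UC_ARCH_X86, unicorn.UC_MODE_32)
    uc.mem_map(0x1000, 0x2000)
    uc.mem_write(0x1000, sc)
    # Stack pointer inside the mapped region (the decoder stub uses the stack)
    uc.reg_write(x86.UC_X86_REG_ESP, 0x2000)
    # Run at most 0x166 instructions from the start of the shellcode
    uc.emu_start(0x1000, 0, count=0x166)
    out = uc.mem_read(0x1000, 0x2000)

    # Print each original byte next to the byte read back from memory
    for x in range(len(sc)):
        print("0x%08x: 0x%02x => 0x%02x" % (x, sc[x], out[x]))
    print(str(out))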

if __name__ == "__main__":
    main()

The output is the same as in issue https://github.com/unicorn-engine/unicorn/issues/820:

0x0000000f: 0x58 => 0x58
0x00000010: 0x31 => 0x31
0x00000011: 0x47 => 0x47
0x00000012: 0x1a => 0x1a
0x00000013: 0x03 => 0x03
0x00000014: 0x47 => 0x47
0x00000015: 0x1a => 0x1a
0x00000016: 0x83 => 0x83
0x00000017: 0xc7 => 0xc7
0x00000018: 0x04 => 0x04
0x00000019: 0xe2 => 0xe2  ; the loop instruction
0x0000001a: 0x13 => 0xf5  ; the correct immediate after decoding
0x0000001b: 0x8b => 0x8b  ; definitely not the `cld` instruction (see `x64dbg` screenshot)
0x0000001c: 0xa9 => 0xa9  ; definitely not a `call` instruction
0x0000001d: 0x3e => 0x3e
0x0000001e: 0xdb => 0x00  ; interestingly enough this byte is correct!
0x0000001f: 0x74 => 0x77  ; ^ that's the first byte of the 2nd xor
0x00000020: 0x2a => 0xc1
0x00000021: 0x5f => 0xa5
0x00000022: 0x52 => 0x89  ; and so is this one (first byte, 3rd xor)
0x00000023: 0x91 => 0xeb
0x00000024: 0x1b => 0xb7
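To see what the bytes at 0x1a onward should look like, the decoder loop can be modelled in plain Python, without Unicorn. This is only a sketch based on my own reading of the stub's opcodes, which the thread does not spell out: `mov eax, 0xbc4177e6`, then repeatedly `xor [edi+0x1a], eax`, `add eax, [edi+0x1a]`, `add edi, 4`, `loop`, with `edi` pointing at the start of the shellcode. The function name `model_decoder` is hypothetical, and only the first two rounds are modelled on the first 0x22 bytes.

```python
import struct

MASK32 = 0xFFFFFFFF


def model_decoder(buf, start, key, rounds):
    """Apply `rounds` iterations of: xor the dword at the cursor with key,
    then add the decoded dword to key (mod 2**32) and advance 4 bytes."""
    out = bytearray(buf)
    for i in range(rounds):
        off = start + 4 * i
        (enc,) = struct.unpack_from("<I", out, off)
        dec = enc ^ key
        struct.pack_into("<I", out, off, dec)
        key = (key + dec) & MASK32
    return bytes(out)


# First 0x22 bytes of the first test's shellcode, copied from the hex string above.
head = bytes.fromhex(
    "dbd0d97424f45fb8e67741bc31c9b158"
    "31471a03471a83c704e2138ba93edb74"
    "2a5f"
)

# Assumed parameters: first patched offset 0x1a, initial key from `mov eax, imm32`.
decoded = model_decoder(head, start=0x1A, key=0xBC4177E6, rounds=2)
for off in range(0x1A, 0x22):
    print("0x%08x: 0x%02x => 0x%02x" % (off, head[off], decoded[off]))
```

Under this model the bytes at 0x1a through 0x21 become `f5 fc e8 82 00 00 00 60`: the loop displacement 0xf5, then 0xfc (`cld`) and 0xe8 (`call`), which agrees with the annotations in the listing above and differs from what Unicorn's memory view reports at 0x1b, 0x1c and 0x1d.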

Second test

Here is a self-modifying shellcode: [image]

The actual assembly, dumped from memory, looks like this:

[image]

Let's run it in Unicorn.

import unicorn
import unicorn.x86_const as x86

sc = bytes.fromhex(
            'bb1c577352d9cbd97424f45e2bc9b13a83c604315e10035e10fea2a899437'
            '50f1aa85ffb063a072f8e738bacf88fdddaee6c760dedbe7b7e1871cd1512'
            'db0c1a84d38fbcbcfe46673ddcd74fc0e1a21bff31234b8a52df3d1e823a2'
            'a96360381fb0d38818232246c9e4e037805ccb144263da113512d546a2943'
            '9a0b661895c947be24fbb8e30928bd805e3e1057998c1280977d382cbf248'
            '02fe04d6745f3b2896164ace56c5db05a4975a618b822883e2773102c5ce7'
            'c7d5fe6cbd54fa5815db17894753f9989842fba365fe29c48be52e75c93e2'
            '834091d18c39340cab8dd3033eae87fe460bd1d55d775310e76ebc9d0676d'
            '9583c3845f7bd9f64784e2b6c1'
)

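def main():
    # 32-bit x86 emulator with 0x2000 bytes mapped at 0x1000
    uc = unicorn.Uc(unicorn.UC_ARCH_X86, unicorn.UC_MODE_32)
    uc.mem_map(0x1000, 0x2000)
    uc.mem_write(0x1000, sc)
    # Stack pointer inside the mapped region (the decoder stub uses the stack)
    uc.reg_write(x86.UC_X86_REG_ESP, 0x2000)
    # Run at most 0x1b instructions from the start of the shellcode
    uc.emu_start(0x1000, 0, count=0x1b)
    out = uc.mem_read(0x1000, 0x2000)

    # Print each original byte next to the byte read back from memory;
    # addresses are shown relative to 0x08048054
    for x in range(len(sc)):
        print("0x%08x: 0x%02x => 0x%02x" % (x + 0x08048054, sc[x], out[x]))
    print(str(out))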

if __name__ == "__main__":
    main()

The output is not correct:

0x08048054: 0xbb => 0xbb
0x08048055: 0x1c => 0x1c
0x08048056: 0x57 => 0x57
0x08048057: 0x73 => 0x73
0x08048058: 0x52 => 0x52
0x08048059: 0xd9 => 0xd9
0x0804805a: 0xcb => 0xcb
0x0804805b: 0xd9 => 0xd9
0x0804805c: 0x74 => 0x74
0x0804805d: 0x24 => 0x24
0x0804805e: 0xf4 => 0xf4
0x0804805f: 0x5e => 0x5e
0x08048060: 0x2b => 0x2b
0x08048061: 0xc9 => 0xc9
0x08048062: 0xb1 => 0xb1
0x08048063: 0x3a => 0x3a
0x08048064: 0x83 => 0x83
0x08048065: 0xc6 => 0xc6
0x08048066: 0x04 => 0x04
0x08048067: 0x31 => 0x31
0x08048068: 0x5e => 0x5e
0x08048069: 0x10 => 0x10
0x0804806a: 0x03 => 0x03
0x0804806b: 0x5e => 0x5e
0x0804806c: 0x10 => 0x10
0x0804806d: 0xfe => 0xe2  // loop
0x0804806e: 0xa2 => 0xf5
0x0804806f: 0xa8 => 0xa8  // wrong decode result!
0x08048070: 0x99 => 0x99  // wrong decode result! ...
0x08048071: 0x43 => 0xbd
0x08048072: 0x75 => 0x39
0x08048073: 0x0f => 0x13
0x08048074: 0x1a => 0xf6
0x08048075: 0xa8 => 0x13
0x08048076: 0x5f => 0xd9
0x08048077: 0xfb => 0xd4
0x08048078: 0x06 => 0xe4
0x08048079: 0x3a => 0xf4
0x0804807a: 0x07 => 0x58
0x0804807b: 0x2f => 0x2b
0x0804807c: 0x8e => 0x49
0x0804807d: 0x73 => 0xb1
0x0804807e: 0x8b => 0x33
0x0804807f: 0xac => 0x83
0x08048080: 0xf8 => 0xe8
0x08048081: 0x8f => 0x8f
0x08048082: 0xdd => 0xdd
0x08048083: 0xda => 0xda
0x08048084: 0xee => 0xee
0x08048085: 0x6c => 0x6c
0x08048086: 0x76 => 0x76
0x08048087: 0x0d => 0x0d
0x08048088: 0xed => 0xed
0x08048089: 0xbe => 0xbe
0x0804808a: 0x7b => 0x7b
0x0804808b: 0x7e => 0x7e
0x0804808c: 0x18 => 0x18
0x0804808d: 0x71 => 0x71
0x0804808e: 0xcd => 0xcd
0x0804808f: 0x15 => 0x15
0x08048090: 0x12 => 0x12
0x08048091: 0xdb => 0xdb
0x08048092: 0x0c => 0x0c
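The per-byte listing above is long, so it can be easier to spot the suspicious gaps by condensing it into ranges of changed bytes. The helper below is a minimal sketch in plain Python; `changed_ranges` is a hypothetical name, and the sample bytes are the eight bytes starting at 0x0804806d copied from the listing.

```python
def changed_ranges(before, after):
    """Return [(start, end_exclusive), ...] for maximal runs where the buffers differ."""
    ranges = []
    start = None
    n = min(len(before), len(after))
    for i in range(n):
        if before[i] != after[i]:
            if start is None:
                start = i          # a new run of differing bytes begins
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, n))  # run extends to the end of the buffers
    return ranges


# Bytes at 0x0804806d..0x08048074: original vs. value read back after emulation.
before = bytes.fromhex("fea2a89943750f1a")
after = bytes.fromhex("e2f5a899bd3913f6")

base = 0x0804806D
for start, end in changed_ranges(before, after):
    print("changed: 0x%08x-0x%08x" % (base + start, base + end - 1))
```

For this sample the two unchanged bytes at 0x0804806f and 0x08048070 show up as a hole between two changed ranges, which is the same discrepancy flagged in the listing.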

@aquynh @xwings

wonderkun commented 4 years ago

@xwings Is anyone looking at this issue? I would like to fix it with your help.

xwings commented 4 years ago

If this is an issue with Unicorn (very likely it is):

  1. The issue should be raised in Unicorn; the community there will take a look at it.
  2. Since this is an open source project, would you be able to raise a PR in Unicorn?
  3. Test and confirm that the fix you mentioned works, then get @aquynh to merge it?

Finally, I cannot fix Unicorn :(

wonderkun commented 4 years ago

Thank you very much! I will send a PR to Unicorn.