qwhai / volatility

Automatically exported from code.google.com/p/volatility
GNU General Public License v2.0

ssdt plugin doesn't work on x64 #189

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Mainly we need to:

* Find KiServiceTable (not exported, no longer referenced by KTHREAD)
* Fix the hard-coded i*4 stuff in ssdt.py
* Compute the function address (no longer absolute pointers like x86)
* Generate syscall lists 

Original issue reported on code.google.com by michael.hale@gmail.com on 23 Jan 2012 at 3:37

GoogleCodeExporter commented 9 years ago
I have this kinda working, but:

* it's using hard-coded offsets for KiServiceTable (like connections and sockets 
do for tcpip.sys stuff)
* it's a different plugin altogether from ssdt.py (since so much has changed)

$ sudo python vol.py -f /mem.dmp --profile=Win7SP0x64 ssdt64
Volatile Systems Volatility Framework 2.1_alpha
   Entry 0x1000: 0xfffff96000145580 (NtUserGetThreadState) owned by win32k.sys
   Entry 0x1001: 0xfffff96000142630 (NtUserPeekMessage) owned by win32k.sys
   Entry 0x1002: 0xfffff96000153c6c (NtUserCallOneParam) owned by win32k.sys
   Entry 0x1003: 0xfffff96000161dd0 (NtUserGetKeyState) owned by win32k.sys
....

Anyway, it's a step in the right direction ;-) More to come later...

Original comment by michael.hale@gmail.com on 24 Jan 2012 at 7:19

GoogleCodeExporter commented 9 years ago
Hey guys, 

Of the original list of requirements:

* Generate syscall lists.....DONE. I modified the scripts to support x64 and 
generated all the NT and GUI syscall lists. These have already been applied to 
trunk. 
* Fix the hard-coded i*4 stuff in ssdt.py....DONE (in the attached patch)
* Compute the function address (no longer absolute pointers like x86)....DONE 
(in the attached patch)
* Find KiServiceTable (not exported, no longer referenced by KTHREAD)....DONE 

So allow me a quick explanation of the last step. On x86 we found the ssdt by 
enumerating threads and looking at _ETHREAD.Tcb.ServiceTable. On x64, you 
cannot assign SSDTs per thread like you can on x86, thus there is no longer a 
ServiceTable member. Rootkits often made copies of the ssdt and simply changed 
the ServiceTable pointer to point at the copy. Microsoft removed this 
capability. On x64 if you want to modify the ssdt, you'd have to do it directly 
by overwriting pointers in KeServiceDescriptorTable (NT syscalls) and 
KeServiceDescriptorTableShadow (GUI syscalls). However, then you run into two 
more issues:

* KeServiceDescriptorTable and KeServiceDescriptorTableShadow are no longer 
exported by the NT module (ntoskrnl.exe) on x64 
* You have to bypass patchguard also 

Since we're not hooking, patchguard makes no difference...we're just concerned 
with finding KeServiceDescriptorTable and KeServiceDescriptorTableShadow. If 
they were exported like they are on x86 we could use the new API applied in 
r1322 - something like mod.getprocaddress("KeServiceDescriptorTable"). But they 
are not. As a result, we also cannot expect to find pointers to them in any 
other module besides ntoskrnl.exe. We cannot rely on PDB parsing to find the 
symbols, because that functionality is not yet applied to trunk. 

So we hope for a reference to KeServiceDescriptorTable and 
KeServiceDescriptorTableShadow from a function that *is* exported. At first, it 
didn't look good. According to IDA's cross-references, these symbols are only 
accessed by one function: KiSystemCall64:

.text:0000000140070EC0                         KiSystemCall64  proc near
.....
.text:0000000140070FF2 4C 8D 15 47 78 23 00                    lea     r10, KeServiceDescriptorTable
.text:0000000140070FF9 4C 8D 1D 80 78 23 00                    lea     r11, KeServiceDescriptorTableShadow

Unfortunately, KiSystemCall64 is also not exported, so back at square one. 
While poking around, I got lucky and found a reference to the symbols in a 
function that *is* exported - KeAddSystemServiceTable. It doesn't show up in 
IDA's cross-reference list, because the reference is built by combining the 
hard-coded ImageBase with an RVA to the symbol:

public KeAddSystemServiceTable
KeAddSystemServiceTable proc near

TableIndex      = dword ptr  28h

                mov     eax, [rsp+TableIndex]
                cmp     eax, 1
                ja      short cannot_add_table
                mov     r10, rax
                lea     r11, cs:140000000h             ; ImageBase
                shl     r10, 5         
                cmp     qword ptr [r10+r11+2A8840h], 0 ; RVA 
                jnz     short cannot_add_table
                cmp     qword ptr [r10+r11+2A8880h], 0 ; RVA
                jnz     short cannot_add_table
                mov     [r10+r11+2A8880h], rcx
                mov     [r10+r11+2A8888h], rdx
                mov     [r10+r11+2A8890h], r8d
                mov     [r10+r11+2A8898h], r9
                cmp     eax, 1
                jz      short loc_1403E2D42
                mov     [r10+r11+2A8840h], rcx
                mov     [r10+r11+2A8848h], rdx
                mov     [r10+r11+2A8850h], r8d
                mov     [r10+r11+2A8858h], r9

loc_1403E2D42:                         
                mov     al, 1
                retn
; ---------------------------------------------------------------------------
cannot_add_table:                     

                xor     al, al
                retn
KeAddSystemServiceTable endp
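
As a cross-check of the two listings above, here is a minimal sketch of the address arithmetic. It assumes the two lea instructions in KiSystemCall64 are RIP-relative (target = address of the next instruction + the 32-bit displacement in the instruction bytes) and that KeAddSystemServiceTable computes ImageBase + RVA + (TableIndex << 5). The helper names are made up for illustration and are not part of Volatility.

```python
# Values copied from the listings above; helper names are hypothetical.
IMAGE_BASE = 0x140000000


def rip_relative_target(insn_addr, insn_len, disp32):
    """Target of a RIP-relative operand: next instruction address + displacement."""
    return insn_addr + insn_len + disp32


def descriptor_address(table_rva, table_index):
    """Address formed by KeAddSystemServiceTable: ImageBase + RVA + (index << 5)."""
    return IMAGE_BASE + table_rva + (table_index << 5)


# lea r10, KeServiceDescriptorTable        (4C 8D 15 47 78 23 00)
nt_table = rip_relative_target(0x140070FF2, 7, 0x00237847)
# lea r11, KeServiceDescriptorTableShadow  (4C 8D 1D 80 78 23 00)
shadow_table = rip_relative_target(0x140070FF9, 7, 0x00237880)

# Subtracting the image base should give the RVAs used in KeAddSystemServiceTable.
nt_rva = nt_table - IMAGE_BASE
shadow_rva = shadow_table - IMAGE_BASE
print(hex(nt_rva), hex(shadow_rva))
```

If the assumptions hold, both routes land on the same two symbols: the lea targets in KiSystemCall64 and the ImageBase + RVA operands in KeAddSystemServiceTable.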

In the example, 2A8840h is the RVA to KeServiceDescriptorTable and 2A8880h is 
the RVA to KeServiceDescriptorTableShadow. The two CMP QWORD PTR 
[REG+REG+OFFSET], 0 instructions are 9 byte patterns that occur near the 
beginning of a small function which is exported by ntoskrnl.exe - all factors 
that greatly reduce the possibility of false positives. These instructions will 
always be present because their purpose is to check if the tables are in use 
before adding a new table. Here's the reversed version of the function, showing 
offsets for the structures (added to ssdt_vtypes.py in trunk):

typedef struct _SERVICE_DESCRIPTOR_ENTRY { //Size: 0x20
    /* 0x00 */ PVOID * KiServiceTable;
    /* 0x08 */ DWORD * CounterTableBase;
    /* 0x10 */ QWORD   ServiceLimit; 
    /* 0x18 */ BYTE  * ArgumentTable;
} _SERVICE_DESCRIPTOR_ENTRY, *PSERVICE_DESCRIPTOR_ENTRY; 

typedef struct _SERVICE_DESCRIPTOR_TABLE { //Size: 0x40
    _SERVICE_DESCRIPTOR_ENTRY Descriptors[2]; 
} _SERVICE_DESCRIPTOR_TABLE;

//These are the symbols we're looking for. 
_SERVICE_DESCRIPTOR_TABLE KeServiceDescriptorTable;
_SERVICE_DESCRIPTOR_TABLE KeServiceDescriptorTableShadow;

BOOL KeAddSystemServiceTable( 
                             PVOID * ServiceTable, 
                             DWORD * CounterTableBase, 
                             DWORD   ServiceLimit,
                             BYTE  * ArgumentTable, 
                             DWORD   TableIndex)
{
    /* The tables must not already be in use */
    /* The tables must not already be in use */
    if (TableIndex > 1 
        || KeServiceDescriptorTable.Descriptors[TableIndex].KiServiceTable 
        || KeServiceDescriptorTableShadow.Descriptors[TableIndex].KiServiceTable) 
    {
        return FALSE;
    }
    else 
    {
        /* Add the new table to Shadow (GUI) first */
        KeServiceDescriptorTableShadow.Descriptors[TableIndex].KiServiceTable = ServiceTable;
        KeServiceDescriptorTableShadow.Descriptors[TableIndex].CounterTableBase = CounterTableBase;
        KeServiceDescriptorTableShadow.Descriptors[TableIndex].ServiceLimit = ServiceLimit;
        KeServiceDescriptorTableShadow.Descriptors[TableIndex].ArgumentTable = ArgumentTable;
        /* Add the new table to NT next (index 0 only) */
        if (TableIndex != 1) 
        { 
            KeServiceDescriptorTable.Descriptors[TableIndex].KiServiceTable = ServiceTable;
            KeServiceDescriptorTable.Descriptors[TableIndex].CounterTableBase = CounterTableBase;
            KeServiceDescriptorTable.Descriptors[TableIndex].ServiceLimit = ServiceLimit;
            KeServiceDescriptorTable.Descriptors[TableIndex].ArgumentTable = ArgumentTable;
        }   
        return TRUE;
    }
}
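
To make the structure layout above concrete, here is a rough sketch that splits 0x40 raw bytes into the two 0x20-byte descriptors using only the standard struct module. It is not the vtype definition from ssdt_vtypes.py. The field names follow the typedef, every field is read as 8 bytes as the typedef lists them, and the sample values are invented.

```python
import struct
from collections import namedtuple

# Field names follow the typedef above; this is an illustration, not the
# actual vtype definition in ssdt_vtypes.py.
ServiceDescriptorEntry = namedtuple(
    "ServiceDescriptorEntry",
    "KiServiceTable CounterTableBase ServiceLimit ArgumentTable")

ENTRY_FORMAT = "<QQQQ"  # four little-endian 8-byte fields at 0x00, 0x08, 0x10, 0x18
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 0x20


def parse_descriptor_table(data):
    """Split 0x40 bytes into the two descriptors of a _SERVICE_DESCRIPTOR_TABLE."""
    return [
        ServiceDescriptorEntry(
            *struct.unpack_from(ENTRY_FORMAT, data, index * ENTRY_SIZE))
        for index in range(2)
    ]


# Invented sample: descriptor 0 populated, descriptor 1 empty (all zeros).
sample = struct.pack(
    ENTRY_FORMAT, 0xFFFFF80002A8B300, 0, 401, 0xFFFFF80002A8BF8C
) + b"\x00" * ENTRY_SIZE

entries = parse_descriptor_table(sample)
for index, entry in enumerate(entries):
    print(index, hex(entry.KiServiceTable), entry.ServiceLimit)
```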

So based on this information, I wrote a patch to ssdt.py that handles x64. 
Here's a summary: 

* It checks the profile's memory model. If x86 it proceeds the way it always 
has - by finding _ETHREAD.Tcb.ServiceTable
* If on x64, it finds the NT module, uses 
getprocaddress("KeAddSystemServiceTable") to locate the exported function 
* If installed, it uses distorm3 to decompose instructions and find the CMP 
QWORD we're looking for. If not installed, we use a generic x64 instruction 
parser using volatility's object model. 
* The two RVAs are extracted from KeAddSystemServiceTable and returned as a 
list. _SERVICE_DESCRIPTOR_TABLE objects are instantiated using the NT module 
base and the RVAs. 
* The _SERVICE_DESCRIPTOR_TABLE objects are added to the same set() as the 
dereferenced _ETHREAD.Tcb.ServiceTable pointers 
* We continue normally through the calculate function 

* Once we get to render_text, the function pointers must be handled differently 
on x64. The entries in the syscall table are unsigned long on x86 but signed 
long on x64. The signed longs are RVAs from the base of the service table. 
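
The RVA-extraction step in the list above can be sketched as a plain byte scan. This is not the code from the patch, which uses distorm3 or Volatility's object model. It assumes the 9-byte encoding is 4B 83 BC, one SIB byte, a 32-bit displacement and a zero immediate. The sample bytes are hand-assembled stand-ins and are not copied from a real ntoskrnl.exe.

```python
import struct

CMP_PREFIX = b"\x4B\x83\xBC"  # REX prefix, opcode 83 /7, ModRM selecting SIB + disp32
PATTERN_LEN = 9               # prefix (3) + SIB (1) + disp32 (4) + imm8 (1)


def find_table_rvas(code):
    """Return the disp32 of each 'cmp qword ptr [reg+reg+disp32], 0' in code."""
    rvas = []
    i = 0
    while i + PATTERN_LEN <= len(code):
        window = code[i:i + PATTERN_LEN]
        if window[0:3] == CMP_PREFIX and window[8:9] == b"\x00":
            # Bytes 4..7 hold the little-endian displacement, i.e. the RVA.
            rvas.append(struct.unpack_from("<I", window, 4)[0])
            i += PATTERN_LEN
        else:
            i += 1
    return rvas


# Hand-assembled stand-in for the start of KeAddSystemServiceTable.
sample = (
    b"\x8B\x44\x24\x28\x83\xF8\x01\x77\x2C\x4C\x8B\xD0\x49\xC1\xE2\x05"
    b"\x4B\x83\xBC\x1A\x40\x88\x2A\x00\x00"  # cmp qword ptr [...+2A8840h], 0
    b"\x75\x1F"
    b"\x4B\x83\xBC\x1A\x80\x88\x2A\x00\x00"  # cmp qword ptr [...+2A8880h], 0
    b"\x75\x14"
)

rvas = find_table_rvas(sample)
print([hex(rva) for rva in rvas])
```

A scan like this only makes sense over the first few dozen bytes of the exported function, which is what keeps the chance of a false positive low.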

Please let me know if you have any questions/comments and by all means if you 
have x64 images please test it out. You need at least r1322 from the trunk and 
the attached patch.  

Original comment by michael.hale@gmail.com on 31 Jan 2012 at 1:54

Attachments:

GoogleCodeExporter commented 9 years ago
Hmmm, so a couple of things about this patch actually...

First off, there's the distorm dependency.  I realize we rolled it into the 
last release, but if it's going to become an official dependency we should say 
that, and if it's not then that should be commented somewhere in the code so 
that people don't think they can rely on it.  I know that it can work without, 
but then if it can, why bother having it?  Presumably it's because distorm does 
a better job in certain corner cases, but that then worries me that if we don't 
have distorm we'll return bad results.  This probably needs a slightly bigger 
discussion to reach a decision.

Secondly I'd probably have metadata.get('memory_model') default to a result of 
'32bit', since all ASes before that metadata was added were 32bit.
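
A minimal sketch of the default being suggested, using a plain dict in place of the profile metadata. The helper name and the '64bit' value are assumptions for illustration.

```python
def is_32bit(metadata):
    """Treat profiles that predate the memory_model key as 32-bit."""
    return metadata.get("memory_model", "32bit") == "32bit"


# Plain dicts standing in for profile metadata (hypothetical values).
old_profile = {}                          # no memory_model key at all
x64_profile = {"memory_model": "64bit"}

print(is_32bit(old_profile), is_32bit(x64_profile))
```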

Also, I've just been looking at the patch, so I don't have wider context, but 
why are we accumulating ssdts in a list/set ("ssdts.add(ssdt_obj)") rather than 
yielding them, and if there's a reason, why is there a StopIteration exception 
raised earlier on?

Lastly, do remember there's an address native_type, which should be the right 
size for an address in whichever space it's been initialized.  I'd advocate 
using that over an explicit 'unsigned long', either that or make it clearer in 
the preceding comment why address wouldn't work...

Everything else looks great, and the code is superbly commented as usual, great 
work!  5:)

Original comment by mike.auty@gmail.com on 31 Jan 2012 at 10:00

GoogleCodeExporter commented 9 years ago
Thanks for the comments!

> I realize we rolled it into the last release, but if it's going to become an 
official dependency we should say that

We do mention it on the FAQ (probably as official as it gets) 
http://code.google.com/p/volatility/wiki/FAQ#What_are_the_dependencies_for_running_Volatility?

> I know that it can work without, but then if it can, why bother having it? 

Hmm this is a difficult question to answer. I think whenever possible we should 
try to use Vol's object model as opposed to having extra dependencies. That 
said, I think there are a few exceptions for when an external library will 
always do a better job - cryptography (PyCrypto) and disassembling (distorm3) 
are two of them. Since distorm3 already has roots in our trunk (for malware 
plugins and volshell) and we're performing a disassembly task, I figured why 
not try to use it. However, assuming someone installs Volatility from source 
(and not one of the precompiled Windows exe w/ distorm3 integrated), they can 
still get by without distorm3 - just some malware plugins and volshell's dis() 
command won't work. I didn't want to also make ssdt on x64 one of those 
non-working plugins, thus the backup mechanism using Vol's object model. 

> that then worries me that if we don't have distorm we'll return bad results

Yeah, like I said distorm3 will always do a better job disassembling. However 
the task in this particular case is really so simple, there's not much chance 
of it failing. It's just a step above doing a simple pattern match like data[0] 
== "\x4B" and data[1] == "\x83" and data[2] == "\xBC" etc...

So yeah, I'm not sure what we want to do about it ;-(

> I'd probably have metadata.get('memory_model') default to a result of '32bit'

You bet! I'll fix it up. 

> why are we accumulating ssdts in a list/set ("ssdts.add(ssdt_obj)") rather 
than yielding them

Each thread has a ServiceTable, but usually there are only 2 unique 
ServiceTables (on x86 there can be more if copies are made by malware). We 
don't yield ssdt_obj immediately because then we'd be printing a syscall list 
basically for each thread. 
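
Roughly, the accumulate-then-yield idea looks like the sketch below. The threads are plain dicts with a made-up "ServiceTable" key standing in for _ETHREAD.Tcb.ServiceTable. The real plugin works on Volatility objects and does more with each table before yielding.

```python
def unique_service_tables(threads):
    """Collect ServiceTable pointers from all threads, then yield each one once."""
    seen = set()
    for thread in threads:
        # Many threads share the same table, so the set collapses duplicates.
        seen.add(thread["ServiceTable"])
    for table in sorted(seen):
        yield table


# Hypothetical threads: three of them, but only two distinct tables.
threads = [
    {"ServiceTable": 0x82988B00},
    {"ServiceTable": 0x82988B40},
    {"ServiceTable": 0x82988B00},
]

tables = list(unique_service_tables(threads))
print([hex(table) for table in tables])
```

Yielding inside the first loop instead would produce one result per thread, which is the repeated syscall list described above.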

> why is there a StopIteration exception raised earlier on?

After doing more work on the unique ssdt_obj objects, we eventually yield...so 
the StopIteration still makes sense I think. 

> Lastly, do remember there's an address native_type, which should be the right 
size for an address in whichever space it's been initialized.
> I'd advocate using that over an explicit 'unsigned long',

I can see why this would be confusing. Take another look at the code:

if bits32:
    # These must be unsigned long for x86 because they are absolute
    # function addresses in kernel memory. 
    syscall_addr = obj.Object('unsigned long', table + (i * 4), vm).v()
else:
    # These must be signed long for x64 because they are RVAs relative
    # to the base of the table and can be negative. 
    offset = obj.Object('long', table + (i * 4), vm).v()
    # The offset is the top 28 bits of the 32 bit number (value >> 4). 
    syscall_addr = table + (offset >> 4)

We wouldn't want to use "address" because we actually don't want to read 
different size values per profile. We read i*4 on x86 and x64. Another reason 
(assuming the size thing didn't matter) is that on x86 the 4 byte value must be 
unsigned and on x64 it must be signed. Using "address" wouldn't let us make 
that distinction. 
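
Here is a self-contained sketch of the same distinction using the standard struct module instead of Volatility's obj.Object. The table base and the entry values are invented. It assumes, as in the snippet above, that an x64 entry is a signed 32-bit value whose offset part is value >> 4.

```python
import struct


def decode_entries(table_base, raw, bits32):
    """Turn raw 4-byte service table slots into function addresses."""
    addresses = []
    for i in range(len(raw) // 4):
        if bits32:
            # x86: each slot is an absolute, unsigned 32-bit function address.
            (value,) = struct.unpack_from("<I", raw, i * 4)
            addresses.append(value)
        else:
            # x64: each slot is signed; shifting drops the low 4 bits and
            # keeps the sign, leaving an offset relative to the table base.
            (value,) = struct.unpack_from("<i", raw, i * 4)
            addresses.append(table_base + (value >> 4))
    return addresses


X64_BASE = 0xFFFFF80002A8B300  # invented table address

# Invented slots: offset +0x1000 and offset -0x200, each with low bits set.
x64_raw = struct.pack("<ii", (0x1000 << 4) | 2, (-0x200 << 4) | 1)
x64_addrs = decode_entries(X64_BASE, x64_raw, bits32=False)

x86_raw = struct.pack("<II", 0x82A8B300, 0x82A8C000)
x86_addrs = decode_entries(0, x86_raw, bits32=True)

print([hex(a) for a in x64_addrs], [hex(a) for a in x86_addrs])
```

Both branches step through the table 4 bytes at a time. Only the signedness and the meaning of the value differ.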

Original comment by michael.hale@gmail.com on 1 Feb 2012 at 2:54

GoogleCodeExporter commented 9 years ago
Yeah,

Tricky one...  I'm going to let someone else make the call on this one, but for 
now I'm happy having both in place.

Fair enough, as I said, I was only looking at the patch, so didn't have the 
wider context.

Yeah, ok.  As long as there's reason behind a fixed size, that's no problem, 
was just difficult to see why from the code patch.  Since we're already 
splitting them into x86/x64, it may still be worth setting x86 to 'address' so 
as not to confuse people.  The i * 4 I take it is based on struct sizes?  I 
agree it's not worth looking up the struct sizes, given they're so fixed, just 
like to check what's what...  5:)

Original comment by mike.auty@gmail.com on 1 Feb 2012 at 6:47

GoogleCodeExporter commented 9 years ago
Hey Mike, 

I asked AW to take a look and give his opinion on the distorm3 thing. 

> The i * 4 I take it is based on struct sizes?
> I agree it's not worth looking up the struct sizes, given they're so fixed, 
just like to check what's what.

Hmm, not necessarily a struct size. It's the size of the value in the table 
"slots". For x86 it's the size of a pointer, and for x64 it's a 32-bit signed 
value whose offset part is added to (or subtracted from, depending on the 
sign) the base of the service table, as in the code above. So for both 
architectures it's 4. That make sense? I agree it's kinda strange. 

Original comment by michael.hale@gmail.com on 2 Feb 2012 at 3:43

GoogleCodeExporter commented 9 years ago
New patch with address instead of unsigned long for x86 and changed the default 
memory model to 32bit. 

Original comment by michael.hale@gmail.com on 2 Feb 2012 at 3:48

Attachments:

GoogleCodeExporter commented 9 years ago
For now, let's keep both.  It gives us a lot of flexibility moving forward, in 
case we need to do something more complicated, and it doesn't create a critical 
dependency.

Original comment by labaru...@gmail.com on 2 Feb 2012 at 4:17

GoogleCodeExporter commented 9 years ago
This issue was closed by revision r1339.

Original comment by michael.hale@gmail.com on 2 Feb 2012 at 4:22

GoogleCodeExporter commented 9 years ago
Thanks, the patch looks good with those changes.  5:)

Original comment by mike.auty@gmail.com on 2 Feb 2012 at 8:30