volatilityfoundation / volatility3

Volatility 3.0 development
http://volatilityfoundation.org/

ProcessHeaps access problems #1128

Open abeDCP opened 7 months ago

abeDCP commented 7 months ago

I am trying to read the contents of ProcessHeaps without success, using volatility3 Framework 2.5.2. To make sure that the rest of the code works and that only the ProcessHeaps part fails, I modified the vadinfo plugin (specifically def list_vads) so that it only shows the VADs that correspond to heaps. The variants I have tried are listed below; the error in all of them is "TypeError: argument of type 'Pointer' is not iterable".

How should I access ProcessHeaps? Where is the error?

Volatility Version: volatility3 Framework 2.5.2

Operating System: Ubuntu 22.04-1 (running volatility)

Python Version: python3 --version --> Python 3.10.12

Suspected Operating System: Windows 7, 32 bits

Command: python3 vol.py -f ../TestHeap.mem windows.VadInfo_mod.HeapInfo --pid 1484

To Reproduce

Modify the vadinfo plugin (def list_vads). Some of the ways I have tried are the following:

1) Similar logic works fine in Volatility2:

    def list_vads(
        cls,
        proc: interfaces.objects.ObjectInterface,
        filter_func: Callable[
            [interfaces.objects.ObjectInterface], bool
        ] = lambda _: False,
    ) -> Generator[interfaces.objects.ObjectInterface, None, None]:

        peb = proc.get_peb()
        heaps = proc.peb.ProcessHeaps.dereference()
        for vad in proc.get_vad_root().traverse():
            vad_start = vad.get_start()
            if not vad_start in heaps:
                continue
            yield vad

2)

    ...
    heaps_list = proc.get_peb().ProcessHeaps

    for vad in proc.get_vad_root().traverse():
        # Check if the VAD is a heap
        is_heap = any(heap == vad.get_start() for heap in heaps_list)
        if not is_heap:
            continue
        yield vad

3)

    ...
    heaps = proc.get_peb().ProcessHeaps.dereference()

    for heap in heaps:
        for vad in proc.get_vad_root().traverse():
            is_heap = heap.BaseAddress == vad.get_start()
            if not is_heap:
                continue
            yield vad

4)

    ...
    peb = proc.get_peb()
    heaps_array_pointer = peb.ProcessHeaps
    number_of_heaps = peb.NumberOfHeaps

    heaps_array = heaps_array_pointer.dereference()
    heap_bases = [heap.BaseAddress for heap in heaps_array]

    for vad in proc.get_vad_root().traverse():
        if vad.get_start() in heap_bases:
            yield vad

Expected behavior

It should show the vadinfo output only for the heaps of the indicated process.

Example output

For example, for case 1:

    if not vad_start in heaps:
    TypeError: argument of type 'Pointer' is not iterable
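For context on this error: Python's `in` operator needs a container or an iterable on its right-hand side, and an integer-like object is neither. The sketch below reproduces the same class of failure with a hypothetical `FakePointer` class. It is a stand-in written for illustration, not the real volatility3 `Pointer`, and it assumes the real class also behaves like a plain integer in this respect.

```python
class FakePointer(int):
    """Hypothetical stand-in for an integer-like pointer object."""


def membership_fails(needle: int, pointer: FakePointer) -> bool:
    """Return True if evaluating `needle in pointer` raises TypeError."""
    try:
        # An int subclass defines neither __contains__ nor __iter__,
        # so the membership test cannot be evaluated.
        needle in pointer
    except TypeError:
        return True
    return False


ptr = FakePointer(0x230000)
print(membership_fails(0x230000, ptr))
```

The membership test only becomes meaningful once the right-hand side is a real collection of addresses, for example a list or set of integers.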

ikelos commented 7 months ago

So, from the error message alone, (and looking up what ProcessHeaps is), it turns out it's a pointer to a pointer to a void. So first you need to dereference it twice (at the moment you're only doing that once) and then you need to cast the void into an array of whatever type so that it can be iterated over (assuming that's the data structure involved), and to do that you'll need to know the length of the array (unless it's a linked list type affair).

Here's the definition I took from a random windows symbol table, for _PEB.ProcessHeaps:

        "ProcessHeaps": {
          "offset": 144,
          "type": {
            "kind": "pointer",
            "subtype": {
              "kind": "pointer",
              "subtype": {
                "kind": "base",
                "name": "void"
              }
            }
          }
        }

At the moment, we don't have plugins that make use of ProcessHeaps, so you're going to have to figure out what structure they take and how to access them appropriately. I hope that helps?
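As an aside, the nesting in a definition like the one above can be checked mechanically. The following is a minimal sketch that parses the quoted fragment with the standard `json` module and counts the pointer levels; `pointer_depth` is a helper name invented for this example and is not part of volatility3.

```python
import json

# The _PEB.ProcessHeaps fragment quoted above, wrapped in braces so it parses.
ISF_FRAGMENT = """
{
  "ProcessHeaps": {
    "offset": 144,
    "type": {
      "kind": "pointer",
      "subtype": {
        "kind": "pointer",
        "subtype": {"kind": "base", "name": "void"}
      }
    }
  }
}
"""


def pointer_depth(type_def: dict) -> tuple:
    """Count nested pointer levels; return (depth, name of the final type)."""
    depth = 0
    while type_def.get("kind") == "pointer":
        depth += 1
        type_def = type_def["subtype"]
    return depth, type_def.get("name", "<unnamed>")


field = json.loads(ISF_FRAGMENT)["ProcessHeaps"]
print(hex(field["offset"]), pointer_depth(field["type"]))
```

A depth of two ending in `void` means the symbol table records how many pointer levels there are, but not what the final pointer refers to, which is why a cast is needed at the end.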

abeDCP commented 7 months ago

Thanks for your time and your answer. How did you obtain this definition of _PEB.ProcessHeaps?

        "ProcessHeaps": {
          "offset": 144,
          "type": {
            "kind": "pointer",
            "subtype": {
              "kind": "pointer",
              "subtype": {
                "kind": "base",
                "name": "void"
              }
            }
          }
        }

The length of the array is given by _PEB.NumberOfHeaps.

With volshell:

    (layer_name) >>> dt(peb)
    symbol_table_name1!_PEB (584 bytes)
       0x0 :   InheritedAddressSpace      symbol_table_name1!unsigned char    0
       0x1 :   ReadImageFileExecOptions   symbol_table_name1!unsigned char    0
       0x2 :   BeingDebugged              symbol_table_name1!unsigned char    0
       0x3 :   BitField                   symbol_table_name1!unsigned char    8
       0x3 :   ImageUsesLargePages        symbol_table_name1!bitfield         0x7ffd5003
       ...
      0x88 :   NumberOfHeaps              symbol_table_name1!unsigned long    10
      0x8c :   MaximumNumberOfHeaps       symbol_table_name1!unsigned long    16
      0x90 :   ProcessHeaps               symbol_table_name1!pointer          1996649728
       ...

    (layer_name) >>> process_heaps = peb.ProcessHeaps.dereference()
    (layer_name) >>> process_heaps
    2293760
    (layer_name) >>> dt(process_heaps)
    symbol_table_name1!pointer (4 bytes)
    (layer_name) >>> process_heaps_dereference = process_heaps.dereference()
    (layer_name) >>> process_heaps_dereference
    <volatility3.framework.objects.Void object at 0x752c0bf43310>
    (layer_name) >>> dt(process_heaps_dereference)
    symbol_table_name1!void (0 bytes)
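Volshell prints these pointer values in decimal, while VAD ranges are usually read in hexadecimal, so converting them makes the comparison easier. The snippet below only converts the two numbers copied from the session above; the variable names are invented for this example.

```python
# Values copied from the volshell session above.
process_heaps_field = 1996649728  # value stored in _PEB.ProcessHeaps
first_dereference = 2293760       # result of peb.ProcessHeaps.dereference()

print(hex(process_heaps_field))
print(hex(first_dereference))

# True when the value sits on a 64 KiB (0x10000) boundary.
is_64k_aligned = first_dereference % 0x10000 == 0
print(is_64k_aligned)
```

The second value is 64 KiB aligned, which would be consistent with it already being a heap base address rather than another table of pointers. That is an inference from the numbers alone, not something the session confirms.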

I don't know how to continue. As I understand it, the ProcessHeaps field, at an offset of 144 bytes (0x90) from the start of the _PEB structure, points to an array of pointers to the process heaps. Each pointer in that array refers to an individual _HEAP structure, which contains the information and metadata needed to manage memory allocations within that heap.
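The layout described here (a field that points to an array of pointers, one per heap) can be sketched without volatility3 at all. The example below builds a tiny fake 32-bit little-endian memory image with `struct` and reads the heap bases back. The field offsets 0x88 and 0x90 come from the `dt(peb)` output in this thread; every address and heap base is made up for illustration, and `read_u32`, `write_u32` and `read_heap_bases` are hypothetical helpers, not volatility3 functions.

```python
import struct

# Hypothetical flat "memory" for illustration only.
memory = bytearray(0x2000)

PEB_BASE = 0x1000               # made-up address of the fake _PEB
ARRAY_BASE = 0x0100             # made-up address of the pointer array
NUMBER_OF_HEAPS_OFFSET = 0x88   # from dt(peb) in this thread
PROCESS_HEAPS_OFFSET = 0x90     # from dt(peb) in this thread
POINTER_SIZE = 4                # 32-bit target
SAMPLE_HEAP_BASES = [0x150000, 0x10000, 0x2A0000]  # made-up heap bases


def read_u32(mem: bytearray, address: int) -> int:
    """Read one unsigned 32-bit little-endian value."""
    return struct.unpack_from("<I", mem, address)[0]


def write_u32(mem: bytearray, address: int, value: int) -> None:
    """Write one unsigned 32-bit little-endian value."""
    struct.pack_into("<I", mem, address, value)


# Populate the fake PEB fields and the array of heap pointers.
write_u32(memory, PEB_BASE + NUMBER_OF_HEAPS_OFFSET, len(SAMPLE_HEAP_BASES))
write_u32(memory, PEB_BASE + PROCESS_HEAPS_OFFSET, ARRAY_BASE)
for index, base in enumerate(SAMPLE_HEAP_BASES):
    write_u32(memory, ARRAY_BASE + index * POINTER_SIZE, base)


def read_heap_bases(mem: bytearray, peb_base: int) -> list:
    """Follow ProcessHeaps once, then read NumberOfHeaps pointer values."""
    count = read_u32(mem, peb_base + NUMBER_OF_HEAPS_OFFSET)
    array_address = read_u32(mem, peb_base + PROCESS_HEAPS_OFFSET)
    return [
        read_u32(mem, array_address + index * POINTER_SIZE)
        for index in range(count)
    ]


heap_bases = read_heap_bases(memory, PEB_BASE)
print([hex(base) for base in heap_bases])
```

Note that the array elements are pointer-sized (4 bytes apart here), and that each element's value, not its location, is the address of a heap.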

    (layer_name) >>> dt("_HEAP")
    symbol_table_name1!_HEAP (312 bytes)
       0x0 :   Entry                            symbol_table_name1!_HEAP_ENTRY
       0x8 :   SegmentSignature                 symbol_table_name1!unsigned long
       0xc :   SegmentFlags                     symbol_table_name1!unsigned long
      0x10 :   SegmentListEntry                 symbol_table_name1!_LIST_ENTRY
      0x18 :   Heap                             symbol_table_name1!pointer
      0x1c :   BaseAddress                      symbol_table_name1!pointer
      0x20 :   NumberOfPages                    symbol_table_name1!unsigned long
      0x24 :   FirstEntry                       symbol_table_name1!pointer
      0x28 :   LastValidEntry                   symbol_table_name1!pointer
      0x2c :   NumberOfUnCommittedPages         symbol_table_name1!unsigned long
      0x30 :   NumberOfUnCommittedRanges        symbol_table_name1!unsigned long
      0x34 :   SegmentAllocatorBackTraceIndex   symbol_table_name1!unsigned short
      0x36 :   Reserved                         symbol_table_name1!unsigned short
      0x38 :   UCRSegmentList                   symbol_table_name1!_LIST_ENTRY
      0x40 :   Flags                            symbol_table_name1!unsigned long
      0x44 :   ForceFlags                       symbol_table_name1!unsigned long
      0x48 :   CompatibilityFlags               symbol_table_name1!unsigned long
      0x4c :   EncodeFlagMask                   symbol_table_name1!unsigned long
      0x50 :   Encoding                         symbol_table_name1!_HEAP_ENTRY
      0x58 :   PointerKey                       symbol_table_name1!unsigned long
      0x5c :   Interceptor                      symbol_table_name1!unsigned long
      0x60 :   VirtualMemoryThreshold           symbol_table_name1!unsigned long
      0x64 :   Signature                        symbol_table_name1!unsigned long
      0x68 :   SegmentReserve                   symbol_table_name1!unsigned long
      0x6c :   SegmentCommit                    symbol_table_name1!unsigned long
      0x70 :   DeCommitFreeBlockThreshold       symbol_table_name1!unsigned long
      0x74 :   DeCommitTotalFreeThreshold       symbol_table_name1!unsigned long
      0x78 :   TotalFreeSize                    symbol_table_name1!unsigned long
      0x7c :   MaximumAllocationSize            symbol_table_name1!unsigned long
      0x80 :   ProcessHeapsListIndex            symbol_table_name1!unsigned short
      0x82 :   HeaderValidateLength             symbol_table_name1!unsigned short
      0x84 :   HeaderValidateCopy               symbol_table_name1!pointer
      0x88 :   NextAvailableTagIndex            symbol_table_name1!unsigned short
      0x8a :   MaximumTagIndex                  symbol_table_name1!unsigned short
      0x8c :   TagEntries                       symbol_table_name1!pointer
      0x90 :   UCRList                          symbol_table_name1!_LIST_ENTRY
      0x98 :   AlignRound                       symbol_table_name1!unsigned long
      0x9c :   AlignMask                        symbol_table_name1!unsigned long
      0xa0 :   VirtualAllocdBlocks              symbol_table_name1!_LIST_ENTRY
      0xa8 :   SegmentList                      symbol_table_name1!_LIST_ENTRY
      0xb0 :   AllocatorBackTraceIndex          symbol_table_name1!unsigned short
      0xb4 :   NonDedicatedListLength           symbol_table_name1!unsigned long
      0xb8 :   BlocksIndex                      symbol_table_name1!pointer
      0xbc :   UCRIndex                         symbol_table_name1!pointer
      0xc0 :   PseudoTagEntries                 symbol_table_name1!pointer
      0xc4 :   FreeLists                        symbol_table_name1!_LIST_ENTRY
      0xcc :   LockVariable                     symbol_table_name1!pointer
      0xd0 :   CommitRoutine                    symbol_table_name1!pointer
      0xd4 :   FrontEndHeap                     symbol_table_name1!pointer
      0xd8 :   FrontHeapLockCount               symbol_table_name1!unsigned short
      0xda :   FrontEndHeapType                 symbol_table_name1!unsigned char
      0xdc :   Counters                         symbol_table_name1!_HEAP_COUNTERS
     0x130 :   TuningParameters                 symbol_table_name1!_HEAP_TUNING_PARAMETERS

Thank you Best Regards.

abeDCP commented 7 months ago

Is it possible that the available attributes are not defined in the ProcessHeaps object?

    (layer_name) >>> print(dir(process_heaps_dereference))
    ['VolTemplateProxy', '__abstractmethods__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_abc_impl', '_context', '_vol', 'cast', 'get_symbol_table_name', 'has_member', 'has_valid_member', 'has_valid_members', 'vol', 'write']

    (layer_name) >>> print(dir(process_heaps))
    ['VolTemplateProxy', '_PrimitiveObject__new_value', '__abs__', '__abstractmethods__', '__add__', '__and__', '__annotations__', '__bool__', '__ceil__', '__class__', '__delattr__', '__dict__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floor__', '__floordiv__', '__format__', '__ge__', '__getattr__', '__getattribute__', '__getnewargs__', '__getnewargs_ex__', '__gt__', '__hash__', '__index__', '__init__', '__init_subclass__', '__int__', '__invert__', '__le__', '__lshift__', '__lt__', '__mod__', '__module__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__trunc__', '__weakref__', '__xor__', '_abc_impl', '_cache', '_context', '_data_format', '_struct_type', '_unmarshall', '_vol', 'as_integer_ratio', 'bit_count', 'bit_length', 'cast', 'conjugate', 'denominator', 'dereference', 'from_bytes', 'get_symbol_table_name', 'has_member', 'has_valid_member', 'has_valid_members', 'imag', 'is_readable', 'numerator', 'real', 'to_bytes', 'vol', 'write']

Thank you Best Regards.

ikelos commented 7 months ago

You can use the volobj.cast method to change the type of an object, so when you get to the void object, you can cast it to whatever type it should be (for example _HEAP).

I found the _PEB.ProcessHeaps structure from the JSON ISF file stored somewhere in your profile directory (it should say exactly which symbol table was used in the output with -vvvvvvv).

So once you have your void object, I imagine you'd do something like:

    >>> heap_object = process_heaps_dereference.cast('_HEAP')

and you can then see what methods are available on that. If it's in fact an array rather than just a single _HEAP, you can construct an array first. Given you've said you know the number of heaps, you should be able to do something along the lines of:

    heap_array = process_heaps_dereference.cast('array', count=_PEB.NumberofHeaps, subtype='_HEAP')


I don't remember whether subtype will accept strings or requires a constructed type, but you can use `symbol_table.get_type('_HEAP')` if necessary.

I hope this helps?

abeDCP commented 7 months ago

Wow, thanks for your time and help.

Yes. ProcessHeap is a pointer to the primary heap of a process, and ProcessHeaps is an array of pointers to the process heaps. The first entry in that array always points to the same location as ProcessHeap, because it is the primary heap.

So I first tried with ProcessHeap (the primary heap):

    peb = proc.get_peb()
    process_heap_dereference = peb.ProcessHeap.dereference()
    heap_main = process_heap_dereference.cast('_HEAP')
    heap_main_address = heap_main.BaseAddress.dereference()

Result (:( again?):

    <volatility3.framework.objects.Void object at 0x7bb3985d4fd0>

    (layer_name) >>> dt("_HEAP")
    symbol_table_name1!_HEAP (312 bytes)
       0x0 :   Entry              symbol_table_name1!_HEAP_ENTRY
       0x8 :   SegmentSignature   symbol_table_name1!unsigned long
       0xc :   SegmentFlags       symbol_table_name1!unsigned long
      0x10 :   SegmentListEntry   symbol_table_name1!_LIST_ENTRY
      0x18 :   Heap               symbol_table_name1!pointer
      0x1c :   BaseAddress        symbol_table_name1!pointer

How can I continue? Is it that the type of the data cannot be determined? According to dt("_HEAP"), BaseAddress is a pointer, but when I dereference it, it seems that the type it points to cannot be determined.

Thank you so much. Best Regards.

ikelos commented 7 months ago

So the BaseAddress is a pointer, but again it looks like the types don't define what type it's pointing to, so you'll need to figure out what you expect that to be, and then try heap_main_address.cast('<type_you_expect_it_to_be>'). However, since it says it's an address, my guess is that you've got the offset right there? You can treat heap_main_address as a number, and from the looks of your output, it's 0x7bb3985d4fd0? I'm not sure what type of structure lives at that address or what data is there. Presumably that's where the actual heap data is?

Also, bear in mind that I thought ProcessHeap was an array of pointer-to-pointers. You only appear to have one level of dereference applied, so casting that pointer as a _HEAP may just cause it to read memory and try to treat it as if it were a _HEAP. There's no way of error checking the data, so please be careful that you understand what lives at which address in memory...

abeDCP commented 7 months ago

Thanks ikelos for your help and time. I'm trying to get array of heaps as you told me

"heap_array = process_heaps_dereference.cast('array', count=_PEB.NumberofHeaps, subtype='_HEAP') I don't remember whether subtype will accept strings or requires a constructed type, but you can use symbol_table.get_type('_HEAP') if necessary."

But I have problems:

    peb = proc.get_peb()
    number_of_heaps = peb.NumberOfHeaps
    process_heaps_pointer = peb.ProcessHeaps.dereference()
    heap_array = process_heaps_pointer.cast('array', count=number_of_heaps, subtype='_HEAP')
    heap_addresses = []

    for heap_index, process_heap in enumerate(heap_array):
        heap_entry_address = process_heap.Entry.vol.offset

The error:

    heap_array = process_heaps_pointer.cast('array', count=number_of_heaps, subtype='_HEAP')
    AttributeError: ObjectTemplate object has no attribute size

If I use:

    peb = proc.get_peb()
    number_of_heaps = peb.NumberOfHeaps
    process_heaps_pointer = peb.ProcessHeaps.dereference()
    heap_type = symbol_table.get_type('_HEAP')
    heap_array = process_heaps_pointer.cast('array', count=number_of_heaps, subtype=heap_type)

I get this error:

    heap_type = symbol_table.get_type('_HEAP')
                ^^^^^^^^^^^^
    NameError: name 'symbol_table' is not defined

What's wrong? How can I continue?

Thank you Best Regards

ikelos commented 7 months ago

Sorry, the symbol_table doesn't exist in volshell by default, it was more a stand in for people that use volshell often. You need to know the name of the symbol table you want, and then look it up in the context's symbol space using its name.

You'd want to do:

symbol_table = self.context.symbol_space[self.current_symbol_table]
heap_type = symbol_table.get_type('_HEAP')
heap_array = process_heaps_pointer.cast('array', count=number_of_heaps, subtype=heap_type)

As to why the earlier one didn't work, that's stranger, I thought all ObjectTemplates should have a size, so I'll need to investigate that one...

ikelos commented 7 months ago

Ok, so subtypes need to be actual subtypes rather than just names of subtypes. I'll look into whether we can improve this. In the rest of our code we tend to be inside an object (so self already points to an object) and then do:

    subtype=self._context.symbol_space.get_type(
        self.get_symbol_table_name() + constants.BANG + "<NAME_OF_TYPE>"
    ),

abeDCP commented 6 months ago

Hi ikelos,

Thanks for your time. I think I have built the subtype correctly, but I am not sure:

    peb = proc.get_peb()
    number_of_heaps = peb.NumberOfHeaps
    process_heaps_dereference = peb.ProcessHeaps.dereference()
    process_heaps_pointer = process_heaps_dereference.dereference()
    symbol_table = proc._context.symbol_space[proc.get_symbol_table_name()]
    heap_type = symbol_table.get_type('_HEAP')
    heap_array = process_heaps_pointer.cast('array', count=number_of_heaps, subtype=heap_type)

I will describe the tests I am doing (comparing results between volatility2 and volatility3) to be sure that I am really getting the right values.

With Vol2: in the render_text() method, I look for this loop:

    for vad, _addrspace in task.get_vads(vad_filter = filter, skip_max_commit = True):

Before the loop, I add this line (to get all the heaps of a process):

    heaps = task.Peb.ProcessHeaps.dereference()

In the loop, after the first test, I add these lines:

    if self._config.HEAPS and not vad.Start in heaps:
        continue

Basically, when I set the HEAPS flag and the PID, it lists the info for the VADs that belong to a heap. It works as expected, and also matches the vadtree info and the VadS tag.
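The filtering idea itself (keep only the VADs whose start address is one of the heap bases) does not depend on either framework. Here is a minimal sketch of it in plain Python. `Vad` and `heap_vads` are hypothetical names made up for this example, and apart from the 0x400000 to 0x4fffff range that appears later in this thread, the sample ranges are invented.

```python
from typing import Iterable, Iterator, NamedTuple


class Vad(NamedTuple):
    """Hypothetical minimal VAD record: start and end virtual addresses."""
    start: int
    end: int


def heap_vads(vads: Iterable[Vad], heap_bases: Iterable[int]) -> Iterator[Vad]:
    """Yield only the VADs whose start address equals a known heap base."""
    bases = set(heap_bases)  # set membership is O(1) per VAD
    for vad in vads:
        if vad.start in bases:
            yield vad


sample_vads = [
    Vad(0x10000, 0x1FFFF),    # invented, not a heap
    Vad(0x400000, 0x4FFFFF),  # range reported in this thread
    Vad(0x6B0000, 0x6BFFFF),  # invented end address
]
selected = list(heap_vads(sample_vads, [0x400000, 0x6B0000]))
print([hex(vad.start) for vad in selected])
```

The important precondition is that `heap_bases` is a real collection of plain integers, which is exactly the part that is proving difficult to obtain in volatility3.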

But with Vol3, I modified list_vads in vadinfo to get only the VADs that belong to a heap:

    def list_vads(
        cls,
        proc: interfaces.objects.ObjectInterface,
        filter_func: Callable[
            [interfaces.objects.ObjectInterface], bool
        ] = lambda _: False,
    ) -> Generator[interfaces.objects.ObjectInterface, None, None]:

        peb = proc.get_peb()
        number_of_heaps = peb.NumberOfHeaps
        process_heaps_dereference = peb.ProcessHeaps.dereference()
        process_heaps_pointer = process_heaps_dereference.dereference()
        symbol_table = proc._context.symbol_space[proc.get_symbol_table_name()]
        heap_type = symbol_table.get_type('_HEAP')
        heap_array = process_heaps_pointer.cast('array', count=number_of_heaps, subtype=heap_type)
        heap_addresses = []

        for heap_index, process_heap in enumerate(heap_array):
            heap_entry_address = process_heap.Entry.vol.offset
            heap_addresses.append(heap_entry_address)
            print(f"Heap {heap_index} Entry Address: {heap_entry_address}")

            for vad in proc.get_vad_root().traverse():
                if vad.get_start() in heap_addresses and not filter_func(vad):
                    yield vad

The result is that only the main heap at 0x400000 is handled correctly; the rest do not give the correct values. Below you can see the addresses in decimal, and how the output always shows the VadInfo of the main heap (4194304 is 0x400000).

    PID   Process      Offset      Start VPN  End VPN   Tag   Protection      CommitCharge  PrivateMemory  Parent      File  File output

    Heap 0 Entry Address: 4194304
    3764  TFM_6.4.exe  0x850fa4a8  0x400000   0x4fffff  VadS  PAGE_READWRITE  4             1              0x87483628  N/A   Disabled

    Heap 1 Entry Address: 4194616
    3764  TFM_6.4.exe  0x850fa4a8  0x400000   0x4fffff  VadS  PAGE_READWRITE  4             1              0x87483628  N/A   Disabled

    Heap 2 Entry Address: 4194928
    3764  TFM_6.4.exe  0x850fa4a8  0x400000   0x4fffff  VadS  PAGE_READWRITE  4             1              0x87483628  N/A   Disabled

    Heap 3 Entry Address: 4195240
    3764  TFM_6.4.exe  0x850fa4a8  0x400000   0x4fffff  VadS  PAGE_READWRITE  4             1              0x87483628  N/A   Disabled

So it makes me suspect that something is not working as expected.

The correct values are 0x240000, 0x30000, 0x400000 and 0x6b0000.

Thank you Best Regards.

abeDCP commented 6 months ago

Hi Ikelos,

No luck so far: I have not been able to find where the bug is or how to solve it. With volatility2 I can get the heaps (by modifying the VadInfo plugin and checking whether the VAD start is in ProcessHeaps) at:

0x240000 0x30000 0x400000 0x6b0000

Now, if I look with Volatility3 using ProcessHeaps as follows:

    peb = proc.get_peb()
    number_of_heaps = peb.NumberOfHeaps
    process_heaps_dereference = peb.ProcessHeaps.dereference()
    process_heaps_pointer = process_heaps_dereference.dereference()
    symbol_table = proc._context.symbol_space[proc.get_symbol_table_name()]
    heap_type = symbol_table.get_type('_HEAP')
    heap_array = process_heaps_pointer.cast('array', count=number_of_heaps, subtype=heap_type)
    heap_addresses = []

    for heap_index, process_heap in enumerate(heap_array):
        heap_entry_address = process_heap.cast('_HEAP')
        heap_addresses.append(heap_entry_address)
        print(f'Heap {heap_index} Entry Address: {heap_entry_address}')


I get the following addresses:

0x400000 0x400138 0x400270 0x4003a8

Only 0x400000 matches, which is the main heap that is created when the process is created. In fact, what the volatility3 code is doing is taking the main heap and treating the rest of the heaps as if they followed it consecutively in memory, each one 312 bytes (the size of _HEAP) after the previous one. Maybe the bug is in the way the array is obtained?
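The 312-byte spacing can be checked with plain arithmetic. The sketch below takes the main heap address and the _HEAP size reported in this thread and computes where the elements of an array of _HEAP structures starting at that address would sit; `struct_array_offsets` is a helper name invented for this example.

```python
HEAP_STRUCT_SIZE = 312   # size of _HEAP reported by dt("_HEAP") in this thread
FIRST_HEAP = 0x400000    # main heap address reported above


def struct_array_offsets(base: int, element_size: int, count: int) -> list:
    """Addresses of the elements of an array of fixed-size structures."""
    return [base + index * element_size for index in range(count)]


# Addresses printed by the volatility3 attempt above.
observed = [0x400000, 0x400138, 0x400270, 0x4003A8]
predicted = struct_array_offsets(FIRST_HEAP, HEAP_STRUCT_SIZE, len(observed))
print(predicted == observed)
```

The match suggests that the cast is laying out an array of _HEAP structures starting at the first heap, rather than reading a list of heap pointers. If that reading is right, one direction to try (untested here) would be to treat the memory that ProcessHeaps points at as an array of NumberOfHeaps pointers and use each pointer's value as a heap base.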

Thank you Best Regards.

github-actions[bot] commented 4 days ago

This issue is stale because it has been open for 200 days with no activity.