dump of processes extracted by psscan/poolscanner

cecio commented 4 years ago

Hello.

As agreed, I'm going to open the issue:

I was trying to add to psscan the same functionality that pslist has (filters and dump). I did it by importing the plugin, and everything looks to be working fine... at least on Win10 images. I noticed that the "process_dump" method (used by instantiating the pslist plugin) works fine on some Win10 images, but fails on WinXP or Win7 images. The failing part is

proc_layer_name = proc.add_process_layer()

and the error is "Parent layer is not a translation layer, unable to construct process layer".

As @ikelos reported, the problem is due to the fact that

> psscan carves them from the physical space (i.e., without knowing the virtual layer), so you'd need to recast the proc into the virtual layer (assuming you know it, and you can figure out the virtual offset for the process). Either that or we'd need to add a flag to assume they're physically instantiated instead of virtually.

I'm available for any testing or help on this if you'd like. Thanks a lot!
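
To illustrate the error quoted above, here is a toy model of the check, not Volatility 3's actual code. The class names and the `add_process_layer` helper below are hypothetical stand-ins; the only point is that a process layer can be stacked on a translation (virtual) layer, but not directly on the physical layer that psscan carves from.

```python
from typing import Dict


class DataLayer:
    """Hypothetical stand-in for a raw (physical) memory layer."""

    def __init__(self, name: str) -> None:
        self.name = name


class TranslationLayer(DataLayer):
    """Hypothetical stand-in for a virtual layer stacked on a physical one."""

    def __init__(self, name: str, memory_layer: str, dtb: int) -> None:
        super().__init__(name)
        self.memory_layer = memory_layer  # name of the underlying physical layer
        self.dtb = dtb                    # page table base used for translation


def add_process_layer(layers: Dict[str, DataLayer], parent_name: str, dtb: int) -> str:
    """Build a per-process layer from the parent's physical layer and a DTB."""
    parent = layers[parent_name]
    if not isinstance(parent, TranslationLayer):
        # A physical layer has no underlying memory_layer to reuse.
        raise TypeError("Parent layer is not a translation layer, "
                        "unable to construct process layer")
    new_name = f"{parent_name}_Process{len(layers)}"
    layers[new_name] = TranslationLayer(new_name, parent.memory_layer, dtb)
    return new_name


layers: Dict[str, DataLayer] = {"memory_layer": DataLayer("memory_layer")}
layers["primary"] = TranslationLayer("primary", "memory_layer", dtb=0x39000)
```

In this model, a process found through the virtual layer ("primary") can get its own layer, while one instantiated on "memory_layer" raises the error from the report.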

atcuno commented 4 years ago

Hello @cecio ,

Thank you for taking the time to write a proper ticket.

Being able to find the virtual address for a process given its physical offset is indeed important as rootkits that unlink their process will not be present in the process list (and potentially other sources in the kernel), and having the virtual address is required for any deep per-process analysis like list of handles, memory maps, and so on.

To accomplish this task, Volatility 2 implemented a function named virtual_process_from_physical_offset, which can be found here:

https://github.com/volatilityfoundation/volatility/blob/aa6b960c1077e447bda9d64df507ec02f8fcc958/volatility/plugins/taskmods.py#L128

Volatility 2's support for WinXP-8.1 relies on this bounce as those OS versions are scanned at the physical address layer (similar to what @ikelos is quoted as saying). Win10 scans the kernel virtual address space though, which is why the current Vol3 code works on your Windows 10 samples, but not any others.

If you would like to try adding this capability yourself, then you would want to look at these places:

1) how pslist dumps processes: https://github.com/volatilityfoundation/volatility3/blob/master/volatility/framework/plugins/windows/pslist.py#L187

2) where psscan gets its result: https://github.com/volatilityfoundation/volatility3/blob/master/volatility/framework/plugins/windows/psscan.py#L46

If you ported the virtual_process_from_physical_offset function to Vol3, then you should be able to call pslist.process_dump with the process found through the bounce. You could do this in the existing psscan loop. One thing to be aware of, though, is that psscan will find terminated process structures, and these cannot be dumped as their page tables are no longer intact. The bounce can also fail due to smear during acquisition. So your code would need to catch any failed bounce attempts, as Volatility 2 does.
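
The "catch any failed bounce attempts" advice can be sketched as a loop that keeps going when one candidate fails. This is a minimal illustration under stated assumptions: `BounceError`, `bounce_all`, and `toy_bounce` are hypothetical names, and the lookup table stands in for the real thread-list walk.

```python
from typing import Callable, Iterable, List, Optional, Tuple


class BounceError(Exception):
    """Hypothetical error: the physical-to-virtual bounce failed."""


def bounce_all(physical_offsets: Iterable[int],
               bounce: Callable[[int], int]) -> List[Tuple[int, Optional[int]]]:
    """Try to bounce every scanned process; record None for failures."""
    results: List[Tuple[int, Optional[int]]] = []
    for phys in physical_offsets:
        try:
            virt: Optional[int] = bounce(phys)
        except BounceError:
            # Terminated process or smeared memory: report it, but do not dump it.
            virt = None
        results.append((phys, virt))
    return results


def toy_bounce(phys: int) -> int:
    """Stand-in for the real bounce; only two offsets resolve."""
    table = {0x1000: 0x80001000, 0x3000: 0x80003000}
    if phys not in table:
        raise BounceError(f"cannot bounce physical offset {phys:#x}")
    return table[phys]
```

A caller would then dump only the entries whose virtual offset is not None and still list the rest in the psscan output.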

If you decide to add this yourself then please let us know if you have any questions along the way :)

cecio commented 4 years ago

Yes, sure, I'll look into this. I'll let you know if I have questions...thanks! :)

cecio commented 4 years ago

Following what the "virtual_process_from_physical_offset" is doing, I was trying to figure out how to directly extract the _EPROCESS struct for the given physical address, without going through the pool scanner. Snooping around, I was trying something like

virtual_layer_name = self.config['primary']
kvo = self.context.layers[virtual_layer_name].config['kernel_virtual_offset']
ntkrnlmp = self.context.module(self.config["nt_symbols"], layer_name = virtual_layer_name, offset = kvo)

proc = self.context.object(object_type = "_EPROCESS", offset = proc.vol.offset, absolute = True)

The offset is coming from the pool scanner. If I use the same offset with vol2, it works fine. In this case the object returned (proc) seems to be invalid (for example the ThreadListHead is not pointing to anything). Obviously I'm missing something here... I tried to switch to memory_layer for the virtual_layer_name (being on the physical space), but in this case I think that "module building" should be changed...

Any thoughts? :-) Thanks!

atcuno commented 4 years ago

Looking at the psscan code, it seems like you should already have the process object created in the physical translation layer:

https://github.com/volatilityfoundation/volatility3/blob/4a8ce9f1043500afde583fae5da6aa6e258009ee/volatility/framework/plugins/windows/psscan.py#L62

The "mem_object" there should be of type _EPROCESS.

cecio commented 4 years ago

Yes, you are completely right.

I was just trying to build a more generic function (like virtual_process_from_physical_offset in vol2) that relies only on the offset, rather than on already having the object.

But I can start by using the mem_object for the time being, then check if it's possible to go with the other solution in a second iteration.

cecio commented 4 years ago

I think I'm pretty close to a working version.

I'm implementing the sanity checks. One question: I tried to look for a helper function to translate a virtual address to a physical one (like vtop in vol2). I didn't find one for Windows (I saw something for Linux and Mac). Maybe I'm looking in the wrong place.

If it doesn't actually exist, does it make sense to port it from vol2?

atcuno commented 4 years ago

> Yes, you are completely right.
>
> I was just trying to build a more generic function (like virtual_process_from_physical_offset in vol2) that relies only on the offset, rather than on already having the object.
>
> But I can start by using the mem_object for the time being, then check if it's possible to go with the other solution in a second iteration.

From what I remember + a quick grep search, the only places that do this for Windows processes in Vol2 are psscan and volshell, so it seems like sending in an already created object makes sense, especially since psscan gets back an object from the poolscanning API.

atcuno commented 4 years ago

> I think I'm pretty close to a working version.
>
> I'm implementing the sanity checks. One question: I tried to look for a helper function to translate a virtual address to a physical one (like vtop in vol2). I didn't find one for Windows (I saw something for Linux and Mac). Maybe I'm looking in the wrong place.
>
> If it doesn't actually exist, does it make sense to port it from vol2?

What did you find for Linux and Mac? All the OSes should use the same APIs so I would like to make sure these are consistent.

There is the translate() function in Vol3 that can give you the same data as vtop.

There is also a function called mapping that can handle translating more than one page at once and returning the physical offset for each. You can see that in use here:

https://github.com/volatilityfoundation/volatility3/blob/752a62a216cbdf1dd357eeed8ea562552612f694/volatility/framework/plugins/windows/pstree.py#L61
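
The difference between a single-address lookup and a multi-page mapping can be sketched with a toy paged layer. This is not the Vol3 API: `ToyPagedLayer`, its method signatures, and the tuple shape returned by `mapping` are assumptions made for illustration, and the real return values differ.

```python
from typing import Dict, Iterator, Tuple

PAGE_SIZE = 0x1000


class ToyPagedLayer:
    """Toy virtual layer backed by a flat page table (hypothetical)."""

    def __init__(self, page_table: Dict[int, int]) -> None:
        # virtual page number -> physical page number
        self._page_table = page_table

    def translate(self, offset: int) -> int:
        """Translate one virtual address to its physical address."""
        vpn, page_off = divmod(offset, PAGE_SIZE)
        if vpn not in self._page_table:
            raise LookupError(f"page fault at {offset:#x}")
        return self._page_table[vpn] * PAGE_SIZE + page_off

    def mapping(self, offset: int, length: int) -> Iterator[Tuple[int, int, int]]:
        """Yield (virtual_offset, chunk_length, physical_offset) per page touched."""
        end = offset + length
        while offset < end:
            # Never let a chunk cross a page boundary.
            chunk = min(end - offset, PAGE_SIZE - offset % PAGE_SIZE)
            yield offset, chunk, self.translate(offset)
            offset += chunk


layer = ToyPagedLayer({0x10: 0x2, 0x11: 0x7})
```

A range that straddles two virtual pages comes back as two chunks with unrelated physical offsets, which is why a single translated address is not enough for multi-page reads.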

cecio commented 4 years ago

> What did you find for Linux and Mac? All the OSes should use the same APIs so I would like to make sure these are consistent.

Well, maybe I just got it wrong. I was grepping for something like "physical" and I saw this:

volatility/framework/automagic/linux.py:    def virtual_to_physical_address(cls, addr: int) -> int:
volatility/framework/automagic/mac.py:    def virtual_to_physical_address(cls, addr: int) -> int:

but maybe they refer to something different...

> There is the translate() function in Vol3 that can give you the same data as vtop.
>
> There is also a function called mapping that can handle translating more than one page at once and returning the physical offset for each. You can see that in use here

Great! I'll look into these....thanks a lot!!!

cecio commented 4 years ago

Ok, I think I have a working version:

    @classmethod
    def virtual_process_from_physical(cls,
                                      context: interfaces.context.ContextInterface,
                                      layer_name: str,
                                      symbol_table: str,
                                      proc: interfaces.objects.ObjectInterface) -> \
                Optional[interfaces.objects.ObjectInterface]:
        """Returns the virtually addressed process for a physically addressed one,
        or None if the bounce fails.
        """
        # We'll use the first thread to bounce back to the virtual process
        kvo = context.layers[layer_name].config['kernel_virtual_offset']
        ntkrnlmp = context.module(symbol_table,
                                  layer_name = layer_name, offset = kvo)

        tleoffset = ntkrnlmp.get_type("_ETHREAD").relative_child_offset("ThreadListEntry")
        # Start out with the member offset given to us from the profile
        offsets = [tleoffset]

        # If (and only if) we're dealing with 64-bit Windows 7 SP1
        # then add the other commonly seen member offset to the list
        kuser = info.Info.get_kuser_structure(context, layer_name, symbol_table)
        nt_major_version = int(kuser.NtMajorVersion)
        nt_minor_version = int(kuser.NtMinorVersion)
        vers = info.Info.get_version_structure(context, layer_name, symbol_table)
        build = vers.MinorVersion
        bits = context.layers[layer_name].bits_per_register
        version = (nt_major_version, nt_minor_version, build)
        if version == (6, 1, 7601) and bits == 64:
            offsets.append(tleoffset + 8)

        # Now we can try to bounce back
        for ofs in offsets:
            ethread = ntkrnlmp.object(object_type = "_ETHREAD",
                                      offset = proc.ThreadListHead.Flink - ofs,
                                      absolute = True)

            # Ask for the thread's process to get an _EPROCESS with a virtual address space
            virtual_process = ethread.owning_process()
            # Sanity check the bounce.
            # This compares the original offset with the new one (translated from virtual space)
            ph_offset, _, _ = context.layers[layer_name]._translate(virtual_process.vol.offset)
            if virtual_process and \
               proc.vol.offset == ph_offset:
                return virtual_process

        # No candidate offset survived the sanity check
        return None

It works, at least I can use the PsList process_dump without errors. But it needs more testing: I'm trying to compare some dumps done with this and the same with Vol2, and I see some differences sometimes...I need to investigate a bit more. In the meantime, if you can have a look and you see something obviously wrong...thanks!
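
The core of the bounce posted above is pointer arithmetic plus a round-trip check, which can be shown without the framework. In this sketch every address and offset is made up, and the two dictionaries stand in for reading `_ETHREAD` owner pointers and for virtual-to-physical translation.

```python
from typing import Dict, Iterable, Optional


def bounce(proc_phys: int,
           flink: int,
           candidate_offsets: Iterable[int],
           thread_owner: Dict[int, int],
           virt_to_phys: Dict[int, int]) -> Optional[int]:
    """Return the virtual address of the process at proc_phys, or None.

    flink is the first thread's list entry address (a virtual pointer read
    from the physically addressed process). thread_owner maps a thread's
    virtual address to its owning process's virtual address.
    """
    for ofs in candidate_offsets:
        # Step back from the embedded list entry to the start of the thread.
        ethread_virt = flink - ofs
        proc_virt = thread_owner.get(ethread_virt)
        if proc_virt is None:
            continue  # wrong member offset or unreadable thread
        # Sanity check: the virtual process must map back to where we started.
        if virt_to_phys.get(proc_virt) == proc_phys:
            return proc_virt
    return None


# Made-up layout: the list entry really sits at 0x428, the second candidate.
THREAD = 0xFFFFFA8000100000
PROC_VIRT = 0xFFFFFA8000200000
PROC_PHYS = 0x01F00000
owners = {THREAD: PROC_VIRT}
v2p = {PROC_VIRT: PROC_PHYS}
```

Trying several member offsets mirrors the Windows 7 SP1 x64 special case in the posted code, and the final comparison is what rejects a bounce that lands on some other process.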

atcuno commented 4 years ago

This looks overall reasonable to me. One question though: does translate() work the same here as opposed to _translate()? The underscore name is usually a marker meaning for internal use, so if the non-underscore version is accessible then the code will be a bit cleaner.

Going to tag @ikelos on two questions though:

1) Is there a more direct way to get the major/minor/build?

2) Is using _translate (or translate) okay, or do you want .mapping() used in all instances like this, even for one page?

cecio commented 4 years ago

> One question though: does translate() work the same here as opposed to _translate()? The underscore name is usually a marker meaning for internal use, so if the non-underscore version is accessible then the code will be a bit cleaner.

Yes, I switched to .translate, it works fine and it's cleaner as you said. Thanks!

ikelos commented 4 years ago

@atcuno

1) Well, theoretically it would be stored in the metadata of the symbol file (and we've already found the kernel). Unfortunately, most of our generated files don't contain this data (because it's not in the PDB; I think it needs extracting from the original PE). So this method is probably more accurate, but we could theoretically parse the PE for version info once we've found it when getting the PDB GUID, so freshly generated ones could contain it. The difficulty is, unless it's easy to get, we can't always guarantee it'll be there, and if we can't guarantee it'll be there, the fallback becomes the de facto (hence all the os_distinguisher stuff). :S

2) translate(layer, offset) should effectively equate to mapping(layer, offset, 0), and then the result is mangled for slightly easier handling (0 length doesn't technically make sense, but I believe it eases things over a length of 1). So it should be ok, but it should be noted that translate isn't part of the TranslationLayerInterface; it's strictly part of LinearlyMappedLayer, which is almost certainly fine for the Intel layer, but may not be the case when a non-linearly mapped layer (such as compressed memory) enters the scene, so you just need to be a little bit careful about it. As long as everyone's aware of those limitations on translate then I'm ok using it, but for it to work in all cases, it would be better to use mapping and deal with all possible responses.

cecio commented 4 years ago

I'm running some tests to see if I get consistent values. I found a case with strange behavior. I used this memory dump sample, which has a hidden process (Windows XP SP3 x86):

http://amnesia.gtisc.gatech.edu/~moyix/ds_fuzz_hidden_proc.img.bz2

I tried to extract the hidden process (PID 1696) with both vol2 and vol3 with the "new" psscan dump. The result is similar but not equal: the file extracted with vol3 seems to have the sections a bit mixed...I see some zeroed areas for example at the beginning of the .text section and then some of the data are "slipping" in the .rdata section.

I tried to trace where this is happening, and in the reconstruct method called by the process_dump I see the call

raw_data = read_layer.read(self.vol.offset, nt_header.OptionalHeader.SizeOfImage, pad = True)

The read method returns these "zeroed" areas I guess because of the "pad" option set. If I try to call the read with pad = False I see an error

*** volatility.framework.exceptions.PagedInvalidAddressException: Page Fault at entry 0xe10ff7f800000400 in page entry

Since I'm not sure if this is happening because of some problems in the virtual_process_from_physical or this is due to something else, I tried some other processes in the same image. For example PID 620, which is not hidden:

with Vol3 psscan (new version with the dump) and pslist (original version) I get the same result (the two files are identical), but both have these "additional" zeroed areas
with Vol2, the dump does not have the "zeroed" areas

Maybe there is something to check in the read method, or am I missing something? Thanks :)

ikelos commented 4 years ago

So, by way of explanation, the pad option prevents errors and returns 0 where a byte could not be read. This typically affects whole pages, and happens when a page lookup fails. It sounds as though in Volatility 2 these were simply being ignored and not output. We could achieve the same in Volatility 3 by not including the pad option, then catching any InvalidAddressExceptions and not outputting those pages (it would then also require additional code to restart the read after the bad page until everything it should have tried to read is covered). I don't know which is better. I feel the zeroed data gives a more accurate indication of the size of the various areas, but if we're already truncating the full memory to get down to this, perhaps it would be better to simply ignore it?
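
The two behaviours described here, zero-padding versus skipping unreadable pages and resuming after them, can be compared on a toy layer. `ToyLayer` and `read_skipping_bad_pages` are hypothetical names, and `LookupError` stands in for the framework's invalid-address exceptions.

```python
from typing import Dict, List, Tuple

PAGE_SIZE = 0x1000


class ToyLayer:
    """Toy layer where only some pages are readable (hypothetical)."""

    def __init__(self, pages: Dict[int, bytes]) -> None:
        self._pages = pages  # page number -> PAGE_SIZE bytes

    def read(self, offset: int, length: int, pad: bool = False) -> bytes:
        out = bytearray()
        end = offset + length
        while offset < end:
            page, page_off = divmod(offset, PAGE_SIZE)
            chunk = min(end - offset, PAGE_SIZE - page_off)
            if page in self._pages:
                out += self._pages[page][page_off:page_off + chunk]
            elif pad:
                out += bytes(chunk)  # zero-fill the unreadable part
            else:
                raise LookupError(f"page fault at {offset:#x}")
            offset += chunk
        return bytes(out)


def read_skipping_bad_pages(layer: ToyLayer, offset: int,
                            length: int) -> List[Tuple[int, bytes]]:
    """Read page by page, dropping unreadable pages instead of padding them."""
    chunks: List[Tuple[int, bytes]] = []
    end = offset + length
    while offset < end:
        chunk = min(end - offset, PAGE_SIZE - offset % PAGE_SIZE)
        try:
            chunks.append((offset, layer.read(offset, chunk)))
        except LookupError:
            pass  # skip the bad page and resume with the next one
        offset += chunk
    return chunks


toy = ToyLayer({0: b"A" * PAGE_SIZE, 2: b"C" * PAGE_SIZE})
```

With padding the output keeps its full size and the missing page shows up as zeros; with skipping, the output is shorter and later data shifts to earlier offsets.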

If there's just additional padding (null bytes), then that's fine, but if there's data missing in vol3 that's present in vol2, then I'm slightly concerned, and that definitely needs investigating. Is the image you're using private? If not, would you be able to get us a copy and clearly provide both commands you use for vol2 and vol3 so we can look into it further, please?

cecio commented 4 years ago

The image is available on the Internet, it is a sample I found googling:

http://amnesia.gtisc.gatech.edu/~moyix/ds_fuzz_hidden_proc.img.bz2

Here are the commands I'm using:

Vol3

python3 vol.py -f /tmp/ds_fuzz_hidden_proc.img  windows.pslist.PsList --pid 620 --dump

Vol2

python vol.py -f /tmp/ds_fuzz_hidden_proc.img --profile=WinXPSP3x86 procdump -p 620 -D /tmp

Then I also dumped the same process (620) with the "new" PsScan I implemented

python3 vol.py -f /tmp/ds_fuzz_hidden_proc.img  windows.psscan.PsScan --pid 620 --dump

and it returns exactly the same image dumped by the PsList command above.
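
One simple way to confirm that two dumps are byte-for-byte identical is to compare their hashes. This is a generic sketch using only the standard library; the function names are illustrative and the paths are whatever files the two commands produced.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large dumps need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """True if both files have identical contents."""
    return sha256_of(path_a) == sha256_of(path_b)
```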

ikelos commented 4 years ago

Ok, so we're not sure how best to communicate this, but the default (and currently only) mode of dumping processes is from memory. If you add --memory to the vol2 command that you used, you'll get identical files for the process. We're not sure when or whether the old functionality for attempting to reconstruct an executable file will be added, but at least I'm assured now that we're not really doing anything differently. This particular difference has already been flagged in #272 if you want to follow along there.

cecio commented 4 years ago

Great! I'll adjust a couple of things and I'll make a pull request for the dump of psscan.

volatilityfoundation / volatility3

dump of processes extracted by psscan/poolscanner #318