volatilityfoundation / volatility3

Volatility 3.0 development
http://volatilityfoundation.org/

Understand then document or fix why some plugins don't work with --config #1294

Open atcuno opened 2 weeks ago

atcuno commented 2 weeks ago

@ikelos in the workshops, we show --save-config and --config early on when showing new Vol3 features so that people get the performance benefit when running many plugins to solve the labs/exercises.

This then leads them to running plugins like mftscan, the base yarascan, etc. with --config and the plugin not working.

I know at some point we discussed why this happened, but I cannot find it anywhere (it might have been on a call). I think I remember it being something about plugins that don't access kernel memory and just scan a physical layer, so they don't build the full config but I could be off about that.

With this issue in mind, I think we should either:

1) Fix it to work with these affected plugins

Or if 1) is too difficult or undesirable, then:

2a) Document it in our read the docs, so that we can link to it in training materials, tickets, Slack, etc.

2b) If possible, produce a warning when these plugins are run and --config is set. I know direct command line argument access is abstracted a bit from plugins, but if these plugins could check in terms of "did the user set --config" then the plugins could warn not to use --config with them and to use -f directly instead. This would make full automation of plugins a bit weird, but switching from --config to -f for users running plugins directly from the command line should be pretty easy.

Thoughts? Other options? I think this will lead to documentation either way, but I don't have a strong opinion on fixing it outright or making people switch between --config and -f as long as the plugins can warn about it, instead of just failing in a weird way like now.

ikelos commented 2 weeks ago

Yep, it's all from when we introduced the ModuleRequirement. Some plugins don't need a kernel/OS, so just ask for a LayerRequirement, and others do. The configs are just trees of values, which are meaningless if the tree doesn't match up with what the plugin asks for. I'm kinda slowly working on fixing that, but it would be nice to be able to generate configs for virtual machines found by vmscan and have them work with all the other plugins.

To do that, I've already introduced a small change recently where TranslationLayerRequirement/SymbolTableRequirement/ModuleRequirements now record what type of requirement they were when the config is recorded. That should help us reuse some down the line, but it's really fluke that these all work (because all plugins by convention alone, use the same name for their ModuleRequirement, and for their LayerRequirements). It's exceptionally brittle and people became aware of it and started using it before it got a lot of love/attention.

Generally a --config is only guaranteed good for the same version of the same plugin as was used to generate it. That's it, that's as much as you get. By fluke that happens to be enough to be reusable across a whole bunch of plugins, but that's literally fluke. There is no surefire way of ensuring that an arbitrary plugin's requirements will line up with another plugin's requirements. The best we can do is try to make them fit, and if not, run the whole process over again (which we do, so no one should find that things don't work, they may just take longer).

Sadly "fix it" isn't really an option. Happy to document more clearly what it does, but I'm concerned that if people didn't realize exactly how it worked themselves, it might have been taught as some kind of panacea, rather than what it is: a mechanism for rebuilding the state of a singular run from a particular plugin.

1) is kinda light on details... ;P It's not really broken, it's doing exactly what it was supposed to, just not necessarily what people thought it did?

2a) Yep, probably explaining to people what it does might help them understand, but it may not placate them.

2b) I don't think this will be possible, and I don't think that's a problem either. The complaint is not that they get bad data, it's that it doesn't speed anything up? If they want to manually tinker with their config to match up the requirements of the plugin, that's fine. Asking volatility to do that automatically is tricky and is on the cards, but not in the "parity-release" time frame, I'm afraid.

Plugins only fail if they only provide a config generated from a run of a different plugin? Where did they get the idea that configs were transferable between plugins? The --help is woefully unspecific about what a configuration is and how it works, and I can see how people might assume that a configuration is global, so those should be quick to make more accurate but otherwise I don't think --config is really documented anywhere?

atcuno commented 2 weeks ago

Hm yea, then we (the core team) need to figure this out better and document it, because we have been telling people in workshops (since it seems to work in all cases) that, for Windows samples, you can run windows.pslist with --save-config and from then on just use --config for all plugins except the physical memory scanning plugins.

You get correct output with the plugins besides the ones that do the scanning.

The ones that do the scanning (mftscan, base yarascan, ...) fail to execute and complain about missing requirements, which makes more sense now given what you said in your reply.

Given what you said, I think we should have documentation as a parity release goal and then explore other options later.

atcuno commented 2 weeks ago

Follow up: the reason we use this feature is that scanning to find the version is really slow when running 15-20 plugins separately, whereas with --config plugins start producing results basically immediately. In Vol2, you can do this by setting DTB, KDBG, and others in a config or env vars, but it was a manual and painful process. It seemed like --config was meant to automate this (among other benefits), and like I said before, it works except for the plugins that scan physical memory.
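
For illustration, the workflow described here could be scripted roughly as follows. This is a minimal sketch that only builds the command lines and does not run anything. The `vol` executable name, the file paths, the plugin names as typed on the command line, and the `LAYER_ONLY` set are all assumptions made for the example, not something the framework provides.

```python
from typing import Iterable, List

# Assumption: plugins that only scan a physical layer and so may reject a
# config saved by a kernel-based plugin (names taken from this thread).
LAYER_ONLY = {"windows.mftscan", "yarascan"}


def build_commands(sample: str, config: str, plugins: Iterable[str],
                   seed_plugin: str = "windows.pslist") -> List[List[str]]:
    """Return argv lists: one seeding run with --save-config, then reuse it."""
    commands = [["vol", "-f", sample, "--save-config", config, seed_plugin]]
    for plugin in plugins:
        if plugin == seed_plugin:
            continue  # already covered by the seeding run
        if plugin in LAYER_ONLY:
            # Fall back to -f so the plugin builds its own requirements.
            commands.append(["vol", "-f", sample, plugin])
        else:
            commands.append(["vol", "--config", config, plugin])
    return commands


for argv in build_commands("sample.img", "config.json",
                           ["windows.pslist", "windows.pstree", "windows.mftscan"]):
    print(" ".join(argv))
```

Each returned list could be handed to something like `subprocess.run`, which keeps the decision about -f versus --config in one place.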

eve-mem commented 2 weeks ago

Just a thought.

Would it be a bad thing to change the mftscan plugins etc to use a module requirement? Then they can reuse the module information for the other plugins? They only really need the translation layer though.

I guess it then leads to everyone thinking that config saving is magic that works all the time. (It is kind of magical though, at least for me 😁)

🦊

atcuno commented 2 weeks ago

That is an interesting thought too. Should these plugins have had those requirement(s) all along? I think only ikelos knows this though.

eve-mem commented 2 weeks ago

@ikelos comment here though:

> it's really fluke that these all work (because all plugins by convention alone, use the same name for their ModuleRequirement, and for their LayerRequirements). It's exceptionally brittle and people became aware of it and started using it before it got a lot of love/attention.

So if that's the way to go, everyone needs to know exactly why it's working and what would break it.

ikelos commented 2 weeks ago

If they can't fulfill the ModuleRequirement (ie, we can't find the kernel) then the plugins wouldn't run, which for physical scanners is precisely what you want them to do. Better to find a way to allow ModuleRequirements to have the LayerRequirements in a config tested as a way to fulfil a LayerRequirement of a plugin, and to allow a LayerRequirement in a config to partially fulfil a ModuleRequirement in a plugin. That's exactly what I'm hoping to do, but even as complicated as it sounds, it'll probably be more difficult than that. Hence, not ready in two weeks I'm afraid...

ikelos commented 2 weeks ago

> So if that's the way to go, everyone needs to know exactly why it's working and what would break it.

A big part of why it works is that we got everyone to write their plugins asking for something called primary and now for a module called kernel. Since it's usually the same type of requirement, the config trees that they'd each generate happen to look pretty much identical. That's what's been saving us all this time. If someone made a plugin that didn't call its ModuleRequirement kernel or didn't call its TranslationLayerRequirement primary, then none of the generated configs would work with it. We're lucky we strongly suggested it, only now people see it as magic and get angry when it turns out Oz wasn't a wizard after all...
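
To make the naming point concrete, here is a toy model. It is not Volatility's actual matching code, and the function names and sample values are hypothetical. It treats a saved config as a flat mapping of dotted keys and reports which of a plugin's requirement names have no subtree in that config.

```python
from typing import Dict, Iterable, Set


def top_level_names(config: Dict[str, object]) -> Set[str]:
    """Requirement names that have a subtree, i.e. at least one dotted key."""
    return {key.split(".", 1)[0] for key in config if "." in key}


def missing_requirements(config: Dict[str, object],
                         wanted: Iterable[str]) -> Set[str]:
    """Names the plugin asks for that the config has no subtree for."""
    return set(wanted) - top_level_names(config)


# Abbreviated, illustrative config in the shape a kernel-based plugin saves.
kernel_config = {
    "kernel.offset": 2152558592,
    "kernel.layer_name.page_map_offset": 233472,
    "pid": [],
}

print(missing_requirements(kernel_config, ["kernel"]))     # nothing missing
print(missing_requirements(kernel_config, ["primary"]))    # 'primary' is missing
print(missing_requirements(kernel_config, ["nt_kernel"]))  # a renamed requirement is missing too
```

In this model a plugin whose requirement is named anything other than `kernel` gets nothing from the saved config, which is the brittleness described above.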

atcuno commented 2 weeks ago

> @ikelos comment here though:
>
> > it's really fluke that these all work (because all plugins by convention alone, use the same name for their ModuleRequirement, and for their LayerRequirements). It's exceptionally brittle and people became aware of it and started using it before it got a lot of love/attention.
>
> So if that's the way to go, everyone needs to know exactly why it's working and what would break it.

Yes, this is what we need clarification from ikelos on.

From my view, the point of the --config was to prevent us from needing to scan the sample for symbols, offsets, etc. again for any future plugin runs against a sample. My view for this came from how we used this in vol2 and that --config worked this way except for full-sample scanning plugins.

I am not sure what the purpose of --config is if the design is for the same plugin to be run multiple times against the same sample, as users would/should save the output of the first run to avoid having to re-run the plugin.

Given that --config isn't documented now, as noted above, and has only been referenced publicly in our couple of workshops (where we warned students that vol3 isn't in any final form yet), my preference would be for us to make it so that as many plugins as possible can be supported by a config json generated after one plugin run.

This will greatly enhance the vol3 user experience as plugins will run much faster. For example, on many Windows samples that are 32GB+, which is what we normally see in real world systems, windows.pslist can take multiple minutes to produce a process list. With --config, it takes a few seconds as the huge sample isn't being scanned for PDB and other info.

To make this the most sane, I think these are our best choices.

1) We should write a plugin like generate_config (or one per OS) that uses the --save-config arg to just generate a config for a sample. If people don't like this idea, then we should document a plugin known to produce a good config per OS, like windows.pslist for Windows samples.

2) We should document and verify that:

ikelos commented since I started typing this and I agree we are basically "lucky" now, but it also means it wouldn't take much for us to standardize what plugins should do to support --config and then convert whatever lacking plugins to it that can be.

ikelos commented 2 weeks ago

> 1) We should write a plugin like generate_config (or one per OS) that uses the --save-config arg to just generate a config for a sample. If people don't like this idea, then we should document a plugin known to produce a good config per OS, like windows.pslist for Windows samples.

I'm not sure you got the distinction here. We currently have approximately two classes of plugin, those that do not need a kernel, and those that do.

Those that need a kernel (that all happen to have the requirement named kernel):

volatility3/framework/plugins/mac/check_trap_table.py
volatility3/framework/plugins/mac/kevents.py
volatility3/framework/plugins/mac/lsmod.py
volatility3/framework/plugins/mac/mount.py
volatility3/framework/plugins/mac/kauth_scopes.py
volatility3/framework/plugins/mac/ifconfig.py
volatility3/framework/plugins/mac/dmesg.py
volatility3/framework/plugins/mac/netstat.py
volatility3/framework/plugins/mac/socket_filters.py
volatility3/framework/plugins/mac/pstree.py
volatility3/framework/plugins/mac/vfsevents.py
volatility3/framework/plugins/mac/check_syscall.py
volatility3/framework/plugins/mac/proc_maps.py
volatility3/framework/plugins/mac/malfind.py
volatility3/framework/plugins/mac/kauth_listeners.py
volatility3/framework/plugins/mac/bash.py
volatility3/framework/plugins/mac/lsof.py
volatility3/framework/plugins/mac/trustedbsd.py
volatility3/framework/plugins/mac/psaux.py
volatility3/framework/plugins/mac/check_sysctl.py
volatility3/framework/plugins/mac/timers.py
volatility3/framework/plugins/mac/list_files.py
volatility3/framework/plugins/mac/pslist.py
volatility3/framework/plugins/windows/orphan_kernel_threads.py
volatility3/framework/plugins/windows/modules.py
volatility3/framework/plugins/windows/sessions.py
volatility3/framework/plugins/windows/kpcrs.py
volatility3/framework/plugins/windows/filescan.py
volatility3/framework/plugins/windows/privileges.py
volatility3/framework/plugins/windows/debugregisters.py
volatility3/framework/plugins/windows/getsids.py
volatility3/framework/plugins/windows/pe_symbols.py
volatility3/framework/plugins/windows/vadinfo.py
volatility3/framework/plugins/windows/unloadedmodules.py
volatility3/framework/plugins/windows/registry/getcellroutine.py
volatility3/framework/plugins/windows/registry/userassist.py
volatility3/framework/plugins/windows/registry/hivelist.py
volatility3/framework/plugins/windows/registry/printkey.py
volatility3/framework/plugins/windows/registry/hivescan.py
volatility3/framework/plugins/windows/verinfo.py
volatility3/framework/plugins/windows/svcscan.py
volatility3/framework/plugins/windows/netstat.py
volatility3/framework/plugins/windows/info.py
volatility3/framework/plugins/windows/pstree.py
volatility3/framework/plugins/windows/bigpools.py
volatility3/framework/plugins/windows/joblinks.py
volatility3/framework/plugins/windows/suspicious_threads.py
volatility3/framework/plugins/windows/envars.py
volatility3/framework/plugins/windows/driverscan.py
volatility3/framework/plugins/windows/vadyarascan.py
volatility3/framework/plugins/windows/processghosting.py
volatility3/framework/plugins/windows/hollowprocesses.py
volatility3/framework/plugins/windows/svcdiff.py
volatility3/framework/plugins/windows/modscan.py
volatility3/framework/plugins/windows/psscan.py
volatility3/framework/plugins/windows/thrdscan.py
volatility3/framework/plugins/windows/vadwalk.py
volatility3/framework/plugins/windows/unhooked_system_calls.py
volatility3/framework/plugins/windows/devicetree.py
volatility3/framework/plugins/windows/svclist.py
volatility3/framework/plugins/windows/malfind.py
volatility3/framework/plugins/windows/hashdump.py
volatility3/framework/plugins/windows/cachedump.py
volatility3/framework/plugins/windows/iat.py
volatility3/framework/plugins/windows/skeleton_key_check.py
volatility3/framework/plugins/windows/shimcachemem.py
volatility3/framework/plugins/windows/dlllist.py
volatility3/framework/plugins/windows/cmdline.py
volatility3/framework/plugins/windows/symlinkscan.py
volatility3/framework/plugins/windows/driverirp.py
volatility3/framework/plugins/windows/timers.py
volatility3/framework/plugins/windows/mbrscan.py
volatility3/framework/plugins/windows/handles.py
volatility3/framework/plugins/windows/drivermodule.py
volatility3/framework/plugins/windows/virtmap.py
volatility3/framework/plugins/windows/poolscanner.py
volatility3/framework/plugins/windows/threads.py
volatility3/framework/plugins/windows/strings.py
volatility3/framework/plugins/windows/ldrmodules.py
volatility3/framework/plugins/windows/getservicesids.py
volatility3/framework/plugins/windows/psxview.py
volatility3/framework/plugins/windows/ssdt.py
volatility3/framework/plugins/windows/netscan.py
volatility3/framework/plugins/windows/pslist.py
volatility3/framework/plugins/windows/pedump.py
volatility3/framework/plugins/windows/lsadump.py
volatility3/framework/plugins/windows/dumpfiles.py
volatility3/framework/plugins/windows/truecrypt.py
volatility3/framework/plugins/windows/memmap.py
volatility3/framework/plugins/windows/mutantscan.py
volatility3/framework/plugins/windows/callbacks.py
volatility3/framework/plugins/linux/check_creds.py
volatility3/framework/plugins/linux/netfilter.py
volatility3/framework/plugins/linux/pidhashtable.py
volatility3/framework/plugins/linux/lsmod.py
volatility3/framework/plugins/linux/capabilities.py
volatility3/framework/plugins/linux/mountinfo.py
volatility3/framework/plugins/linux/pagecache.py
volatility3/framework/plugins/linux/tty_check.py
volatility3/framework/plugins/linux/library_list.py
volatility3/framework/plugins/linux/sockstat.py
volatility3/framework/plugins/linux/iomem.py
volatility3/framework/plugins/linux/vmayarascan.py
volatility3/framework/plugins/linux/elfs.py
volatility3/framework/plugins/linux/pstree.py
volatility3/framework/plugins/linux/envars.py
volatility3/framework/plugins/linux/check_syscall.py
volatility3/framework/plugins/linux/proc.py
volatility3/framework/plugins/linux/psscan.py
volatility3/framework/plugins/linux/check_idt.py
volatility3/framework/plugins/linux/malfind.py
volatility3/framework/plugins/linux/bash.py
volatility3/framework/plugins/linux/lsof.py
volatility3/framework/plugins/linux/psaux.py
volatility3/framework/plugins/linux/check_afinfo.py
volatility3/framework/plugins/linux/keyboard_notifiers.py
volatility3/framework/plugins/linux/ebpf.py
volatility3/framework/plugins/linux/kmsg.py
volatility3/framework/plugins/linux/pslist.py
volatility3/framework/plugins/linux/check_modules.py
volatility3/framework/plugins/yarascan.py

Those that don't:

volatility3/framework/plugins/vmscan.py
volatility3/framework/plugins/windows/crashinfo.py
volatility3/framework/plugins/windows/mftscan.py
volatility3/framework/plugins/layerwriter.py
volatility3/framework/plugins/configwriter.py
volatility3/framework/plugins/banners.py
volatility3/framework/plugins/yarascan.py

So realistically, we're talking about yarascan, crashinfo and mftscan (please also note, the configwriter plugin, which already exists but only generates configs for the other plugins in its own class).

A config generated by a plugin in one of those classes should work for all the plugins in the same class, so the vast majority of plugins will work with a config created by a plugin of the same class (which is why this feature has sat in the code base for several years and no one shouted so loudly about it until now).

I don't particularly like the idea of a separate generate_config plugin (even though we technically already have one), particularly not one per OS, because the difference isn't between OSes, it's between needing a kernel and not needing a kernel, and we will never be able to produce a one-size-fits-all config and still allow plugins to have the flexibility of different requirements. They're fundamentally opposed, and it misunderstands the point of a configuration system to suppose that everything can be shoehorned into one box.

We could try to fudge a config that includes the same data twice at different points in the tree, but that's like saying we should make scissors come with four handles, because some people are left handed. It might be doable, but it's not a good solution for a problem that's just not that big of an issue as long as people aren't making assumptions about volatility 3 working like volatility 2 (sighs).

> 2) We should document and verify that:

The point is, we kinda have already done all that you're suggesting, and it's not really a big deal to generate an appropriate config for the few plugins that don't need a kernel. The biggest step would be to document how the feature works clearly, so that people don't make incorrect assumptions...

ikelos commented 2 weeks ago

> To do that, I've already introduced a small change recently where TranslationLayerRequirement/SymbolTableRequirement/ModuleRequirements now record what type of requirement they were when the config is recorded. That should help us reuse some down the line, but it's really fluke that these all work (because all plugins by convention alone, use the same name for their ModuleRequirement, and for their LayerRequirements). It's exceptionally brittle and people became aware of it and started using it before it got a lot of love/attention.
>
> If they can't fulfill the ModuleRequirement (ie, we can't find the kernel) then the plugins wouldn't run, which for physical scanners is precisely what you want them to do. Better to find a way to allow ModuleRequirements to have the LayerRequirements in a config tested as a way to fulfil a LayerRequirement of a plugin, and to allow a LayerRequirement in a config to partially fulfil a ModuleRequirement in a plugin. That's exactly what I'm hoping to do, but even as complicated as it sounds, it'll probably be more difficult than that. Hence, not ready in two weeks I'm afraid...

Please also note I've already started laying the groundwork for a way to make better use of what we know in a config, but it will take time.

eve-mem commented 1 week ago

Yarascan is the generic scanner, you wouldn't want that to have kernel for sure. There is a vad yarascan for windows specifically and that would reuse the config. (And vmayarascan for linux. I don't think there is a mac one yet.)

Crash info is more about the crash file itself. Feels okay not to use the config.

So it's really just mftscan, and to me that feels fine not to reuse kernel.

Some more documentation would explain this, and in the future, if yarascan/mftscan could reuse some of the layer parts as @ikelos is working on, that might save some time.

It feels like yarascan is the only one you'd want to run multiple times (with different rules), so adding some documentation for that one plugin specifically saying to make a 'scanning config' for it and then it's resolved? Then if/when we can have a way for translation requirements to borrow layer information from kernel requirements safely it works even better?

I can totally see that running yarascan 10 times in a workshop (or actual work) with each time the automagic taking 5 minutes might feel frustrating.

eve-mem commented 1 week ago

@atcuno if it helps at all here's a rough way to convert a 'module' config to a 'translation layer' config.

Here's a config from a pslist:

{
  "dump": false,
  "kernel.layer_name.class": "volatility3.framework.layers.intel.WindowsIntel",
  "kernel.layer_name.kernel_virtual_offset": 2152558592,
  "kernel.layer_name.memory_layer.class": "volatility3.framework.layers.physical.FileLayer",
  "kernel.layer_name.memory_layer.location": "file:///home/eve/Documents/volatility3/win-xp-laptop-2005-06-25.img",
  "kernel.layer_name.page_map_offset": 233472,
  "kernel.layer_name.swap_layers": true,
  "kernel.layer_name.swap_layers.number_of_elements": 0,
  "kernel.offset": 2152558592,
  "kernel.symbol_table_name.class": "volatility3.framework.symbols.windows.WindowsKernelIntermedSymbols",
  "kernel.symbol_table_name.isf_url": "file:///home/eve/Documents/volatility3/volatility3/symbols/windows/ntoskrnl.pdb/32962337F0F646388B39535CD8DD70E8-2.json.xz",
  "kernel.symbol_table_name.symbol_mask": 0,
  "physical": false,
  "pid": []
}

Here is a config from mftscan:

{
  "primary.class": "volatility3.framework.layers.intel.WindowsIntel",
  "primary.memory_layer.class": "volatility3.framework.layers.physical.FileLayer",
  "primary.memory_layer.location": "file:///home/eve/Documents/volatility3/win-xp-laptop-2005-06-25.img",
  "primary.page_map_offset": 233472,
  "primary.swap_layers": true,
  "primary.swap_layers.number_of_elements": 0,
  "yarascanner": false
}
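
Comparing the two configs key by key shows how much they share. The sketch below (the helper name is hypothetical) extracts the entries under a dotted prefix and compares the layer subtree of the pslist config with the `primary` subtree of the mftscan config. The dictionaries are abbreviated copies of the two configs above.

```python
from typing import Dict


def subtree(config: Dict[str, object], prefix: str) -> Dict[str, object]:
    """Entries under `prefix.`, with that prefix stripped from each key."""
    marker = prefix + "."
    return {k[len(marker):]: v for k, v in config.items() if k.startswith(marker)}


pslist_config = {
    "kernel.layer_name.class": "volatility3.framework.layers.intel.WindowsIntel",
    "kernel.layer_name.kernel_virtual_offset": 2152558592,
    "kernel.layer_name.page_map_offset": 233472,
    "kernel.offset": 2152558592,
}
mftscan_config = {
    "primary.class": "volatility3.framework.layers.intel.WindowsIntel",
    "primary.page_map_offset": 233472,
}

kernel_layer = subtree(pslist_config, "kernel.layer_name")
primary_layer = subtree(mftscan_config, "primary")

# Keys that only the kernel-derived layer carries.
only_in_kernel = sorted(set(kernel_layer) - set(primary_layer))
# Keys present in both with identical values.
shared = sorted(k for k in primary_layer if kernel_layer.get(k) == primary_layer[k])

print(only_in_kernel)
print(shared)
```

For these two samples the layer description is the same data stored under a different prefix, with the kernel-based config carrying a little extra.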

A quick and dirty way to create a config that will work for a plugin with a 'translation layer' requirement is to run the following sed command:

sed -e 's/kernel\.layer_name/primary/g' pslist.json > scan.json

That makes all sorts of assumptions about the layer_name used and will leave leftover entries such as the kernel offset in there - but it should work for now. scan.json now looks like this:

{
  "dump": false,
  "primary.class": "volatility3.framework.layers.intel.WindowsIntel",
  "primary.kernel_virtual_offset": 2152558592,
  "primary.memory_layer.class": "volatility3.framework.layers.physical.FileLayer",
  "primary.memory_layer.location": "file:///home/eve/Documents/volatility3/win-xp-laptop-2005-06-25.img",
  "primary.page_map_offset": 233472,
  "primary.swap_layers": true,
  "primary.swap_layers.number_of_elements": 0,
  "kernel.offset": 2152558592,
  "kernel.symbol_table_name.class": "volatility3.framework.symbols.windows.WindowsKernelIntermedSymbols",
  "kernel.symbol_table_name.isf_url": "file:///home/eve/Documents/volatility3/volatility3/symbols/windows/ntoskrnl.pdb/32962337F0F646388B39535CD8DD70E8-2.json.xz",
  "kernel.symbol_table_name.symbol_mask": 0,
  "physical": false,
  "pid": []
}

I'm sure that @ikelos has in mind a much cleaner and safer way to correctly find the translation layer from a module's set of requirements - it'll just take time.
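
For anyone who would rather work on the parsed JSON than on raw text, the same rename can be sketched in Python. This is only an illustration of the workaround, not how Volatility will eventually solve it. The function name is hypothetical, and it assumes the layer sits under `kernel.layer_name` and the target requirement is called `primary`, as in the configs shown. Unlike a text substitution it only touches keys that start with the prefix, never values, and it can optionally drop the leftover `kernel.*` entries.

```python
import json
from typing import Dict


def module_to_layer_config(config: Dict[str, object],
                           source: str = "kernel.layer_name",
                           target: str = "primary",
                           drop_leftovers: bool = True) -> Dict[str, object]:
    """Rename `source.*` keys to `target.*`; optionally drop other module keys."""
    marker = source + "."
    module_prefix = source.split(".", 1)[0] + "."  # e.g. "kernel."
    converted: Dict[str, object] = {}
    for key, value in config.items():
        if key.startswith(marker):
            converted[target + "." + key[len(marker):]] = value
        elif drop_leftovers and key.startswith(module_prefix):
            continue  # e.g. kernel.offset, kernel.symbol_table_name.*
        else:
            converted[key] = value
    return converted


# Abbreviated example input in the shape of the pslist config.
pslist_config = {
    "dump": False,
    "kernel.layer_name.page_map_offset": 233472,
    "kernel.offset": 2152558592,
    "pid": [],
}
print(json.dumps(module_to_layer_config(pslist_config), indent=2, sort_keys=True))
```

In practice the input would come from `json.load` on the saved file and the result would be written back with `json.dump`.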

eve-mem commented 1 week ago

Sorry for yet another comment. I realised I forgot to say it's possible to combine the two files and have a single file that works for both the module plugins and the translation layer plugins.

A minor point but might be helpful if you're providing some preprepared config files for example.
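
A combined file could be produced along these lines. This is a sketch under the same naming assumptions as the sed workaround (`kernel.layer_name` for the module's layer, `primary` for the translation layer), and the function name is hypothetical. It keeps every original key and adds a `primary.*` copy of the layer entries, refusing to overwrite an existing `primary` key that holds a different value.

```python
from typing import Dict


def combined_config(module_config: Dict[str, object],
                    source: str = "kernel.layer_name",
                    target: str = "primary") -> Dict[str, object]:
    """Return a copy of the config with `source.*` mirrored under `target.*`."""
    marker = source + "."
    combined = dict(module_config)  # the kernel.* entries stay untouched
    for key, value in module_config.items():
        if not key.startswith(marker):
            continue
        new_key = target + "." + key[len(marker):]
        if new_key in combined and combined[new_key] != value:
            raise ValueError(f"conflicting values for {new_key}")
        combined[new_key] = value
    return combined


# Abbreviated example input in the shape of the pslist config.
example = {
    "kernel.layer_name.page_map_offset": 233472,
    "kernel.offset": 2152558592,
    "pid": [],
}
for key in sorted(combined_config(example)):
    print(key)
```

Whether a given plugin tolerates the extra keys is something to verify against the Volatility version in use.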