Proposed enhancement to hivedump

GoogleCodeExporter commented 8 years ago

Hey guys, 

The 1.3 version of hivedump was nice: you could dump an entire hive as text or 
csv and then grep for a certain term or sort by timestamp. In the 1.4 hivedump, 
some of that functionality was lost. I'd like to propose a few changes to the 
r516 version of hivedump:

1) Allow csv output with timestamps

2) In the text output, change the single-space indentation (" ") to something 
like a tab or 4 spaces *and* include the full key name. This way, if you were 
looking for a key named "MaliciousKey" you could pipe the output of hivedump to 
grep and quickly find it, then pass the full key name to printkey. Currently, 
it's difficult because hivedump only prints the subkey name on a line by 
itself, so you'd have to redirect all output of hivedump to a file, open it 
up, look for "MaliciousKey" and then try to follow the single-space indentation 
to determine the full key before calling printkey, which takes a lot of time.

So instead of this:

$$$PROTO.HIV
 C07ft5Y
  WinXP
 Classes
  *
   OpenWithList
    Excel.exe

We could have this:

HKEY_LOCAL_MACHINE\Software
    HKEY_LOCAL_MACHINE\Software\C07ft5Y
        HKEY_LOCAL_MACHINE\Software\C07ft5Y\WinXP
    HKEY_LOCAL_MACHINE\Software\Classes
        HKEY_LOCAL_MACHINE\Software\Classes\*

And this:

1229023892,2008-12-11 19:31:32,HKEY_LOCAL_MACHINE\Software
1208454788,2008-04-17 17:53:08,HKEY_LOCAL_MACHINE\Software\C07ft5Y
1208454788,2008-04-17 17:53:08,HKEY_LOCAL_MACHINE\Software\C07ft5Y\WinXP
1229021837,2008-12-11 18:57:17,HKEY_LOCAL_MACHINE\Software\Classes
1220997002,2008-09-09 21:50:02,HKEY_LOCAL_MACHINE\Software\Classes\*
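
For illustration, here is a minimal Python sketch of how both proposed outputs could be produced from one walk over the key tree. It is not the attached patch: the nested `TREE` dict is a hypothetical stand-in for a parsed hive, `walk`, `render_text` and `render_csv` are names made up for this example, and reading the integers as UTC Unix timestamps is an assumption (it is consistent with the sample rows above).

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical stand-in for a parsed hive:
# key name -> (last-written Unix timestamp, subkeys)
TREE = {
    "HKEY_LOCAL_MACHINE\\Software": (1229023892, {
        "C07ft5Y": (1208454788, {"WinXP": (1208454788, {})}),
        "Classes": (1229021837, {"*": (1220997002, {})}),
    }),
}


def walk(tree, parent="", depth=0):
    """Yield (depth, timestamp, full_path) for every key, parents first."""
    for name, (timestamp, subkeys) in tree.items():
        path = name if not parent else parent + "\\" + name
        yield depth, timestamp, path
        yield from walk(subkeys, path, depth + 1)


def render_text(tree, indent="    "):
    """Full key names, indented four spaces per nesting level."""
    return "\n".join(indent * depth + path for depth, _, path in walk(tree))


def render_csv(tree):
    """One row per key: integer timestamp, readable datetime, full key name."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for _, timestamp, path in walk(tree):
        # Assumption: the integer is seconds since the Unix epoch, in UTC.
        stamp = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        writer.writerow([timestamp, stamp.strftime("%Y-%m-%d %H:%M:%S"), path])
    return out.getvalue()


if __name__ == "__main__":
    print(render_text(TREE))
    print(render_csv(TREE), end="")
```

Because every line carries the full key name, the text form can be piped straight to grep and the match handed to printkey without reading the indentation.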

I attached a patched version of hivelist.py that could be used as a template 
for these changes. Note: I moved the HiveDump command from lsadump.py to 
hivelist.py for this example. Also, you'll see the -o option commented out in 
the HiveList command. I don't think that option is needed any longer since you 
guys made hivelist automatically find the physical offset of the first hive 
(which is great, it eliminates the need to run hivescan first - awesome). 

Thanks

Original issue reported on code.google.com by michael.hale@gmail.com on 8 Nov 2010 at 6:21

Attachments:

hivelist.py

GoogleCodeExporter commented 8 years ago

Ok, thanks for the new hivelist, I'll try and take a look at it in the next 
couple of days.

It sounds, though, as if you're after two different things, one is more human 
readable output (which is a matter of taste, but I'm happy to go with four 
spaces and full key names).  The second is machine readable output for scripts. 
 I'm not sure how best to deal with the difference between the two.  
Particularly as a third option might be to output fully valid .reg files.  I 
don't know if they can contain timestamps (and to be honest, I'm not sure what 
time the stamps represent).  Moyix, any thoughts/suggestions on the situation?

We'll probably leave in the -o option in case the user ever wants to override 
or for some other reason specify their own offset...

Original comment by mike.auty@gmail.com on 8 Nov 2010 at 7:22

Added labels: Type-Enhancement
Removed labels: Type-Defect

GoogleCodeExporter commented 8 years ago

Yeah, no problem. From a malware perspective, there are two main reasons I'd 
use the registry plugins:

1) Determine if a particular key exists anywhere in a hive 
2) Determine which keys were modified within X number of minutes, hours, or 
days since the time of some other suspicious activity 

So to solve #1, if I know the full path to the key like 
HKEY_LOCAL_MACHINE\Software\Microsoft\MaliciousKey, then I can just call 
printkey directly and see if it exists. However, assuming I'm looking for 
MaliciousKey, but it could be anywhere in the software hive, then I would want 
to use hivedump like this:

$ python volatility.py hivedump -o OFFSET -f mem.dmp | grep MaliciousKey
HKEY_LOCAL_MACHINE\ThisKey\ThatKey\OtherKey\MaliciousKey

To solve #2, I would use the csv output like this:

$ python volatility.py hivedump -o OFFSET -f mem.dmp --output=csv --output-file=keys.csv

Then open keys.csv in a spreadsheet and sort by timestamp. The timestamp value 
indicates the last time a value was added to the key, a value was deleted from 
the key, or when any value in the key was modified. So if I know when a 
suspicious process started, or have a time of interest from any other source, 
I can easily see if any changes were made to the registry around the same time. 
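
The same "what changed around this time" check could also be scripted instead of done in a spreadsheet. This is a rough Python sketch only: it assumes the three-column layout proposed earlier in this thread (integer timestamp, readable datetime, full key name), and `keys_modified_near` is a hypothetical helper, not part of Volatility.

```python
import csv
import io
from datetime import datetime, timedelta

# Sample rows in the assumed layout: integer timestamp, datetime, full key name.
SAMPLE = """\
1229023892,2008-12-11 19:31:32,HKEY_LOCAL_MACHINE\\Software
1208454788,2008-04-17 17:53:08,HKEY_LOCAL_MACHINE\\Software\\C07ft5Y
1229021837,2008-12-11 18:57:17,HKEY_LOCAL_MACHINE\\Software\\Classes
1220997002,2008-09-09 21:50:02,HKEY_LOCAL_MACHINE\\Software\\Classes\\*
"""


def keys_modified_near(csv_text, reference, window):
    """Return (datetime, key) pairs written within `window` of `reference`,
    oldest first."""
    hits = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        when = datetime.strptime(row[1], "%Y-%m-%d %H:%M:%S")
        if abs(when - reference) <= window:
            hits.append((when, row[2]))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical time of interest, e.g. when a suspicious process started.
    suspicious = datetime(2008, 12, 11, 19, 0, 0)
    for when, key in keys_modified_near(SAMPLE, suspicious, timedelta(hours=1)):
        print(when, key)
```

With a real dump, the contents of keys.csv would take the place of `SAMPLE`.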

These are just my personal usage scenarios though, I don't know how others 
plan to use the plugins!

MHL

Original comment by michael.hale@gmail.com on 8 Nov 2010 at 7:45

GoogleCodeExporter commented 8 years ago

Yeah, ok.  The trick is the output format stuff.  I don't think there is an 
--output=csv option (although scudette's working on something similar for all 
plugins).  Either way, it sounds as though a human is unlikely to read the raw 
output from hivedump, so the second method (which includes all the known 
information about each key) is possibly the best one; other scripts/tools can 
then be used to reduce it to the necessary bits and pieces.  I'm unlikely to 
include the timestamp twice though, so I'll probably go with a human-readable 
datetime (as we've done with other listing plugins), and then as long as they 
are in a standard output format, a script should be able to deal with them 
appropriately.  Hopefully that sounds a reasonable solution?

Original comment by mike.auty@gmail.com on 8 Nov 2010 at 7:52

GoogleCodeExporter commented 8 years ago

Yeah, to get --output=csv working I just added a HiveDump.render_csv function 
and it was called properly. Sure, going with the human readable datetime is 
fine. For some reason I thought some spreadsheets may have a problem sorting 
the human readable datetimes, so I included the integer value also -- but it 
looks like there's no problem.
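
To show the idea of adding a `render_csv` method next to the text renderer, here is a self-contained Python sketch. It assumes the framework picks the renderer by method name from the `--output` value, which is what the behaviour described above suggests; `HiveDumpSketch`, its `execute` method and the two sample rows are stand-ins invented for this example, not Volatility's actual code.

```python
import csv
import io
import sys


class HiveDumpSketch:
    """Stand-in plugin; NOT the real HiveDump class."""

    def calculate(self):
        # Hypothetical data source: (last written, full key name) pairs.
        yield "2008-12-11 19:31:32", "HKEY_LOCAL_MACHINE\\Software"
        yield "2008-04-17 17:53:08", "HKEY_LOCAL_MACHINE\\Software\\C07ft5Y"

    def render_text(self, outfd, data):
        for stamp, key in data:
            outfd.write("{0} {1}\n".format(stamp, key))

    def render_csv(self, outfd, data):
        writer = csv.writer(outfd, lineterminator="\n")
        for stamp, key in data:
            writer.writerow([stamp, key])

    def execute(self, output="text", outfd=sys.stdout):
        # Assumed dispatch: --output=csv looks up a method named render_csv.
        render = getattr(self, "render_" + output, None)
        if render is None:
            raise ValueError("unsupported output format: " + output)
        render(outfd, self.calculate())


if __name__ == "__main__":
    HiveDumpSketch().execute("csv")
```

Under that assumption, supporting another format only means adding another `render_<name>` method; the command-line handling stays untouched.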

Original comment by michael.hale@gmail.com on 8 Nov 2010 at 9:07

GoogleCodeExporter commented 8 years ago

Ok, so there now should be complete paths and last written times output in the 
hivedump plugin.  I haven't included any of the easy_name stuff, since I'm not 
too keen to go hardcoding such path names into the plugin.  I'd be happier 
linking the file the hive was loaded from (and let the user decide which node 
it'll have been loaded as), but I think that would be a bit cumbersome.  I'm 
not fixed on it, but I'd need some more convincing to think showing the 
investigator a guessed name is a good idea...

Hope that's ok?  If so, feel free to close the bug, if not we can leave it open 
for discussion.  5:)

Original comment by mike.auty@gmail.com on 8 Nov 2010 at 10:52

GoogleCodeExporter commented 8 years ago

I'm happy with this, thanks for getting to it so quick, Mike!

Original comment by michael.hale@gmail.com on 9 Nov 2010 at 2:53

GoogleCodeExporter commented 8 years ago

My pleasure.  5:)

Original comment by mike.auty@gmail.com on 9 Nov 2010 at 2:54

Changed state: Fixed
