Leor3961 / volatility

Automatically exported from code.google.com/p/volatility
0 stars 0 forks source link

Caching framework doesn't invalidate cache nodes on changed config options #13

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
Sorry to keep pestering you with these, but I figure it's best to have a 
placeholder so that we can discuss them.

At the moment, running memdump with -p 1 followed by memdump -p 2 will fail 
because memdump caches the output of dlllist, which is filtered based on 
config.PID.

The solution that comes to mind immediately is to cache the config object 
entirely, and then verify that all they are the same.  If not, then invalidate 
the cache.  We'll probably have to add in a function that returns all the 
(relevant/cache-changing) options to make it easy to store and later verify.  
Then we'll have to assume that all options are cache-affecting by default, and 
go around turning off those that aren't.

This might sidestep the need to hand plugins their own config objects (at least 
for caching, not for use as a library), however it has the downside that a 
config item which isn't used will still invalidate the cache.  Since each 
plugin only sees it's own options (-p on a plugin that doesn't support it will 
raise a help message), this may not be a problem.

Original issue reported on code.google.com by mike.auty@gmail.com on 23 Aug 2010 at 11:38

GoogleCodeExporter commented 8 years ago

Original comment by mike.auty@gmail.com on 26 Aug 2010 at 12:24

GoogleCodeExporter commented 8 years ago
Swapping owner and CC, so I know which tasks I need to work on...

Original comment by mike.auty@gmail.com on 26 Aug 2010 at 11:28

GoogleCodeExporter commented 8 years ago
Issue 21 has been merged into this issue.

Original comment by mike.auty@gmail.com on 27 Aug 2010 at 9:13

GoogleCodeExporter commented 8 years ago
Issue 23 has been merged into this issue.

Original comment by mike.auty@gmail.com on 27 Aug 2010 at 9:18

GoogleCodeExporter commented 8 years ago

Original comment by mike.auty@gmail.com on 2 Sep 2010 at 12:47

GoogleCodeExporter commented 8 years ago
Ok the issue is that the path under which the cache is keyed does not contain a 
PID while the contents of the decorated function uses PID as a parameter. This 
is now fixed by making the path:

    @CacheDecorator(lambda self: "test/memmap/pid%s" % self.config.PID)
    def calculate(self):

The lambda callable will be called in the same way as the calculate function 
(i.e. its passed self in this case). I also put the config object in the 
command class to make it easier to framework the whole thing.

So this solution is elegant because we are not really invalidating the cache 
when we detect a new PID, we actually create a new cache - which will be used 
next time we ask for that PID again. The idea is to make the cache key path an 
invariant (i.e. all external changes which could influence the result should be 
encoded into it).

There is still another problem however, the current cache detects that there is 
a generator within a generator and raises ContainsGenerator which prevents this 
from being cached at all. This issue should be fixed as a different bug.

Original comment by scude...@gmail.com on 2 Sep 2010 at 3:06

GoogleCodeExporter commented 8 years ago
Reopening this since this is supposed to be about more than just a single 
plugin.  This issue involves invalidation of any config change (such as the 
dtb, use of the old address spaces, etc).  This bug's only really high priority 
if we want caching to work without bugs for 1.4, I think if we push out a 
(potentially) buggy caching implementation now, it will scare people off it in 
future versions, so best to get it right even if caching isn't enabled by 
default.

Original comment by mike.auty@gmail.com on 2 Sep 2010 at 4:00

GoogleCodeExporter commented 8 years ago
Ok there is a scenario where caching fails:

volatility.py ident to recognise the dtb (say its 0x3900)

volatility.py pslist --dtb 0x3900  : This will cache the pslist
volatility.py pslist --dtb 0x3901  : This should fail but it just returns the 
cached version.

Original comment by scude...@gmail.com on 2 Sep 2010 at 4:13

GoogleCodeExporter commented 8 years ago
Issue 29 has been merged into this issue.

Original comment by mike.auty@gmail.com on 10 Sep 2010 at 12:24

GoogleCodeExporter commented 8 years ago
Unfortunately this will require tuning individual plugins to be sensitive about 
the config option they themselves need. There is no way for the cache to 
magically know which option is invariant (i.e. produces the same results in the 
cache) and which is not (i.e. we need to recalculate the cached value).

The dtb issue is now fixed in experimental.

We need to make sure that go through the config options and think about which 
are sensitive and then fix the required modules.

This particular issue should be closed now.

Original comment by scude...@gmail.com on 10 Sep 2010 at 7:14

GoogleCodeExporter commented 8 years ago
As I mentioned on IRC, I don't think that requiring the modules be written with 
extreme care to ensure caching issues is the right way to go.  External modules 
aren't under our control and not all developers read documentation so then 
we'll have a lot of bugs in other people's modules to try and fix.

Perhaps we could add a flag to config options, indicating that they will not 
change cached results (such as output flags, --help, and so on) and then have 
that off by default for new config options.  Developers that take the time to 
read it will know which ones they can safely turn off, and those that don't 
will get their results invalidated, which will still give the correct answer.  
That way the entire config can be stored for each cachenode, and just the 
important config options verified at restore point.

There'd still be problems with global config options that can have an effect on 
some plugins, but not on others (forced dtb shouldn't affect physical-as 
scanning plugins, for instance), but as I said, I'd sooner invalidate and be 
wrong, than not-invalidate and be wrong.  5:)

I still believe there's an issue with cache nodes dependent upon each other 
where one's invalidated and the other one doesn't know anything about it, but 
at the very least this would eliminate the biggest problem.

Does that sound like a good plan?  I can mock up a demo if it sounds 
reasonable...

Original comment by mike.auty@gmail.com on 10 Sep 2010 at 10:40

GoogleCodeExporter commented 8 years ago
Ok, r430 is a check-in of a version of the cache invalidation framework that 
scudette developed (thanks scudette!).  Minor changes included removing 
cache_invalidation flags when removing an option.  Also made sure that the 
invalidator is copied when testsuite sets the tree caching the second time 
around.

This should clean up the majority of the circumstances in which bad cache data 
could be handed back to the user, so therefore marking this issue as fixed.

Original comment by mike.auty@gmail.com on 11 Sep 2010 at 1:31

GoogleCodeExporter commented 8 years ago
Sorry for the bugspam, but better to get this right now than later once it's 
more in use.

Original comment by mike.auty@gmail.com on 4 Feb 2011 at 9:34