Closed garanews closed 1 year ago
Hiya, we've discussed this a bit online, but here are some documented answers to your points:
Files that aren't older than the cache expiry time aren't reread. We don't currently check or store the timestamps on the files, just when they were cached. https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/automagic/symbol_cache.py#L262 As such there's scope here to either stash the timestamp (we'd need to figure out what that would mean for remote locations and/or files inside of zip files) or to hash the file and check whether it changed (if this is compressed inside a container, this still could be very slow).
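As a rough illustration of the hashing option, here is a minimal sketch that stores a digest per file in SQLite and reports which files changed since the last run. It is not taken from symbol_cache.py: the `file_hashes` table, the function names, and the choice of SHA-256 are all assumptions made for the example, and it only handles plain local files (not remote locations or files inside zip files).

```python
import hashlib
import sqlite3


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in chunks so large symbol files are not loaded whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def open_hash_db(db_path=":memory:"):
    """Open (or create) the hypothetical digest store."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_hashes (path TEXT PRIMARY KEY, digest TEXT)"
    )
    return conn


def changed_files(conn, paths):
    """Return paths whose digest differs from the stored one, then store the new digest."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        row = conn.execute(
            "SELECT digest FROM file_hashes WHERE path = ?", (path,)
        ).fetchone()
        if row is None or row[0] != digest:
            changed.append(path)
            conn.execute(
                "INSERT OR REPLACE INTO file_hashes (path, digest) VALUES (?, ?)",
                (path, digest),
            )
    conn.commit()
    return changed
```

Only the files returned by `changed_files` would then need to be reparsed; every file still has to be read once per run to compute its digest, which is where the cost for compressed containers comes from.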
It would be possible, but a new class derived from the CacheManagerInterface would need to be written, and there'd need to be some code to allow the choice of cache manager and a way to establish the initial connection.
This is ongoing at #754, please follow along there.
The first point should have been greatly improved by #858, which now checks file timestamps to see whether recaching is necessary. Assuming the datetime stamps on your local files are accurate, this should not require recaching unless a file is modified after the cache expiry timeout has been reached.
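For readers who want a picture of what a timestamp check involves, here is a minimal sketch of the general idea. It is not the code from #858: the `cache` table, its columns, and the function names are hypothetical, and it only covers local files whose modification times can be read with `os.stat`.

```python
import os
import sqlite3


def open_cache(db_path=":memory:"):
    """Open (or create) a hypothetical table recording each file's mtime at caching time."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cache (location TEXT PRIMARY KEY, cached_mtime REAL)"
    )
    return conn


def needs_recache(conn, path):
    """Return True if the file is unknown or was modified after it was cached."""
    mtime = os.stat(path).st_mtime
    row = conn.execute(
        "SELECT cached_mtime FROM cache WHERE location = ?", (path,)
    ).fetchone()
    return row is None or mtime > row[0]


def mark_cached(conn, path):
    """Record the file's current mtime once it has been parsed."""
    conn.execute(
        "INSERT OR REPLACE INTO cache (location, cached_mtime) VALUES (?, ?)",
        (path, os.stat(path).st_mtime),
    )
    conn.commit()
```

A check like this is cheap because it never opens the file, but it is only as reliable as the timestamps on disk.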
This issue is stale because it has been open for 200 days with no activity.
This issue was closed because it has been inactive for 60 days since being marked as stale.
Vol3 recently added a new cache mechanism: https://github.com/volatilityfoundation/volatility3/blob/develop/volatility3/framework/automagic/symbol_cache.py On the first run of Vol3, the symbols are read and the data is stored in a SQLite DB. The default period before the cache is regenerated was initially set to 3 days and has now been raised to 1 month. In my environment, building the cache (1.2MB) takes 10 minutes to parse 3000 Windows symbol files (around 800MB).
Would it be possible to evaluate a mechanism that doesn't regenerate the cache if the files have not changed? I am thinking of something like calculating the MD5 of the files and doing a lookup in a DB. On my system, the md5sum of 3000 files takes 1.4 secs, plus the lookup time.
Would it be possible to store the cache in another database such as PostgreSQL? This could help keep the cache aligned when running Vol3 in different containers.
More generally: would it be possible to let Vol3 read these kinds of variables (located in volatility3/framework/constants/__init__.py) from a config file? SYMBOL_BASEPATHS, PLUGINS_PATH, SQLITE_CACHE_PERIOD, CACHE_PATH, REMOTE_ISF_URL. This could also help in an environment with multiple instances (servers) running Vol3.
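To make the config file request concrete, here is a minimal sketch of how such overrides could be read with the standard library's `configparser`. Vol3 does not necessarily work this way: the `[volatility3]` section name, the `load_overrides` function, and every default value below are placeholders invented for the example, not the real values from the constants module.

```python
import configparser

# Placeholder defaults for illustration only; these are NOT Vol3's real values.
DEFAULTS = {
    "SYMBOL_BASEPATHS": "/opt/vol3/symbols",
    "PLUGINS_PATH": "/opt/vol3/plugins",
    "SQLITE_CACHE_PERIOD": "placeholder-period",
    "CACHE_PATH": "/tmp/vol3-cache",
    "REMOTE_ISF_URL": "",
}


def load_overrides(config_text, defaults=DEFAULTS):
    """Return the defaults with any known keys overridden from INI-style text."""
    # interpolation=None so that '%' characters in URLs are taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive (upper case)
    parser.read_string(config_text)
    settings = dict(defaults)
    if parser.has_section("volatility3"):
        for key, value in parser.items("volatility3"):
            if key in settings:  # ignore options that are not known constants
                settings[key] = value
    return settings
```

With a file like this shared between servers or mounted into each container, every instance would resolve the same cache path and symbol locations:

```python
settings = load_overrides("[volatility3]\nCACHE_PATH = /srv/shared/vol3-cache\n")
print(settings["CACHE_PATH"])
```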