Closed atcuno closed 2 months ago
@ikelos I am working on all the args/returns docs for this. Is there a standard line in an existing plugin for the often repeated arguments (context, config, config_path, etc.). I would like to use the standard one and seems like all plugins should.
@ikelos the new list requirement form is really cool
$ python3 vol.py -f /mnt/samples/Sample1.lime windows.pe_symbols --source processes --module ntdll.dll --symbols NtOpenThread NtCreateFile
Volatility 3 Framework 2.9.0
Module Symbol Address
ntdll.dll NtOpenThread 0x7c90d640
ntdll.dll NtCreateFile 0x7c90d090
$ python3 vol.py -f /mnt/samples/Sample1.lime windows.pe_symbols --source processes --module ntdll.dll --addresses 0x7c90d640 0x7c90d090
Volatility 3 Framework 2.9.0
Module Symbol Address
ntdll.dll NtOpenThread 0x7c90d640
ntdll.dll NtCreateFile 0x7c90d090
@ikelos this is ready again. I left a few conversations unresolved as I either had questions for you to answer or for you to see then resolve. Also, this needs to be prioritized still as every remaining Windows plugin for parity is using this API.
@ikelos I am working on all the args/returns docs for this. Is there a standard line in an existing plugin for the often repeated arguments (context, config, config_path, etc.). I would like to use the standard one and seems like all plugins should.
Seems to be "The context to retrieve required elements (layers, symbol tables) from" for most linux/mac plugins...
Ok, think I've caught most of the responses. I've reopened some because modifying input parameters rather than creating new output parameters really makes my skin crawl. The docstrings saying that the value gets changed do help, but it kinda in little letters at the end rather than something obvious to spot. I'm still not keen on it and I think I need to re-read it again to see where the generator is used and then whether people are expect to interact with it. I think I'd still prefer the generator yielding it as intermediate state (so each found module is part of a tuple containing the list of missed ones, or all those not-yet-resolved, or just something). I'm still looking for a happy compromise we can both agree on, but I'm not yet accepting that there's just a changing variable which you have to know will be altered as you're going around...
Ok, I figured out why I don't like it so much! Almost all the rest of python (not just our code) either returns or mutates, but not both. That's why it's so jarring. That's what my problem with it is (moreso, being in the middle of a generator, it means that you might pass the generator 8 calls away, and then start iterating it, and it will suddenly start changing another variable). I really, really don't like it. There must be another way of achieving the same goal? https://stackoverflow.com/questions/26027694/correct-style-for-python-functions-that-mutate-the-argument
Otherwise it's looking pretty tidy, I think that's the last bit to get sorted...
Ok, think I've caught most of the responses. I've reopened some because modifying input parameters rather than creating new output parameters really makes my skin crawl. The docstrings saying that the value gets changed do help, but it kinda in little letters at the end rather than something obvious to spot. I'm still not keen on it and I think I need to re-read it again to see where the generator is used and then whether people are expect to interact with it. I think I'd still prefer the generator yielding it as intermediate state (so each found module is part of a tuple containing the list of missed ones, or all those not-yet-resolved, or just something). I'm still looking for a happy compromise we can both agree on, but I'm not yet accepting that there's just a changing variable which you have to know will be altered as you're going around...
Ok, I figured out why I don't like it so much! Almost all the rest of python (not just our code) either returns or mutates, but not both. That's why it's so jarring. That's what my problem with it is (moreso, being in the middle of a generator, it means that you might pass the generator 8 calls away, and then start iterating it, and it will suddenly start changing another variable). I really, really don't like it. There must be another way of achieving the same goal? https://stackoverflow.com/questions/26027694/correct-style-for-python-functions-that-mutate-the-argument
Otherwise it's looking pretty tidy, I think that's the last bit to get sorted...
I put thoughts on this here:
https://github.com/volatilityfoundation/volatility3/pull/1250#discussion_r1752763561
My propsoal then is to copy.deepcopy() the parameter so we can remove from the copy, this will keep the parameter in tact. The output parameter (found_symbols) is created from scratch and not a mutation of the input parameter. Does this work? We need to be able to track when we parsed all of the symbols for reasons in my linked comment.
Yep, that sounds ideal. It can be deepcopied (can probably be normal copied since all we really care is that the original list doesn't change and if they're Volatility objects it might be quite slow?) and then updated and fed out on each generator call.
When you say the output parameter of found symbols, I imagined you'd return a tuple of found_symbols, remaining
and then remaining would just be the updated copy of the input symbols but with the found ones removed?
When you say the output parameter of found symbols, I imagined you'd return a tuple of
found_symbols, remaining
and then remaining would just be the updated copy of the input symbols but with the found ones removed?
It doesn't do this now, but I can add that in easily.
@ikelos I got it working in the way you wanted - the copy of the input dictionary being changed only in the same function, no more generator internally, and the unresolved (missing symbols) returned back to the caller. Black/mypy/pylint all passed plus all plugin forms retested. I think we are good now :)
Much happier with that thanks. Hopefully the deep copy isn't too slow, you might be better off just repopulating the lists, but we can worry about efficiency if it proves to be a problem...
The naming of the types is still a little irk, but not a showstopper by any means.
Last the remaining is the big if statements being replaced by or NotAvailableValue()
... I've checked through unhooked_system_calls
and that looks good, so... nearly there!
Much happier with that thanks. Hopefully the deep copy isn't too slow, you might be better off just repopulating the lists, but we can worry about efficiency if it proves to be a problem...
The naming of the types is still a little irk, but not a showstopper by any means.
Last the remaining is the big if statements being replaced by
or NotAvailableValue()
... I've checked throughunhooked_system_calls
and that looks good, so... nearly there!
This should all be done now. The returning None and using "or ..." is here:
@ikelos everything should be addressed now.
… plugin
This is the final form of the PE symbols plugin and API. The API is very flexible and supports plugins calling it with any number of modules (DLLs or kernel drivers) and resolving symbol names to addresses or doing a reverse lookup for addresses to symbols. A built in cache makes it to where repeated calls for the same data do not require any further searching. The API is also made to generically support new methods of symbol resolution through new implementations of PESymbolFinder, for which there will be more in the future. The API is extremely powerful as, instead of how existing plugins only look for PDBs and/or exports in their own process, the API uses every process to attempt to get the symbol information - so even if an infected process has the export table or PDB page smeared, some other process will still hold the data that is needed. The same concept happens in the kernel by walking every session layer.
To show the API in use in both directions (names->address and address->name), I have included two plugins from DEFCON that use the final API form.
Example output of it in an infected sample:
The processes with distinct implementations listed are hooked by the malware.
The second plugin, debug registers, uses the reverse lookup to map an address that is being monitored to its symbol. Note that the plugin can automatically determine that AmsiScanBuffer is being manipulated by this malware.
There are many more plugins that will be using this API once its merged, but I included two only now to show the API in real plugin usage.
All plugins pass:
I also updated vadinfo.py to fully pass these checks along with the needed annotations.