panda-re / panda

Platform for Architecture-Neutral Dynamic Analysis
https://panda.re

Move to TCG-Plugin based model #1383

Open AndrewFasano opened 1 year ago

AndrewFasano commented 1 year ago

As of v4.2.0, upstream QEMU supports "TCG Plugins", which can analyze the execution of a guest as it runs. We've described how these compare to PANDA plugins on our wiki.

TCG Plugins seem like a promising direction for us to implement PANDA-like analyses (with the exception of record/replay) in a way that will help us stay up to date with upstream QEMU. Changes we make towards this goal, along with some of our plugins, could likely be merged into upstream QEMU. I suspect upstream QEMU won't be interested in many of our plugins (since they're large, complex, and unrelated to building a good emulator), but the API changes we might make towards this goal and some of the example plugins would hopefully be something they'd take.

If we were to re-implement parts of PANDA atop this model, those parts would be able to stay up to date with upstream easily, significantly reducing our maintenance burden, and we'd also hopefully be able to contribute something of value back to the upstream project that has given us the emulation backbone of our project.

We've started this effort in the panda-re/qemu repo, but it's currently just an initial effort demonstrating viability. We've created a port of the PPP system for inter-plugin interactions (renamed to QPP) and ported a few plugins: syscalls, stringsearch, syscalls_logger, and OSI / OSI Linux.
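
To make the inter-plugin idea concrete, here is a minimal Python model of a publish/subscribe callback registry. It is only a sketch of the concept: `CallbackRegistry`, the event name `syscalls:on_syscall`, and the callback signature are all hypothetical and are not the PPP or QPP API.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class CallbackRegistry:
    """Toy model of inter-plugin callbacks (hypothetical, not the QPP API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[..., None]]] = defaultdict(list)

    def register(self, event: str, fn: Callable[..., None]) -> None:
        """Subscribe fn to a named event exposed by another plugin."""
        self._subscribers[event].append(fn)

    def run(self, event: str, *args) -> int:
        """Invoke every subscriber of event; return how many were called."""
        subscribers = self._subscribers[event]
        for fn in subscribers:
            fn(*args)
        return len(subscribers)


registry = CallbackRegistry()
logged: List[str] = []


# Consumer side (think "syscalls_logger"): react to events from another plugin.
def log_syscall(pc: int, number: int) -> None:
    logged.append(f"PC {pc:x}: syscall {number:x}")


registry.register("syscalls:on_syscall", log_syscall)

# Producer side (think "syscalls"): publish an event when a syscall is seen.
registry.run("syscalls:on_syscall", 0x7F56FBB62C8C, 0x101)
```

The point of the design is that the producer never needs to know which plugins consume its events, so a logger or an OSI-based analysis can be layered on without changing the producer.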

There are a number of blocking issues with merging this into upstream, in particular missing APIs for reading and modifying guest state. We've speculated that upstream QEMU is likely to be opposed to plugins that modify guest state (in their non-analysis use case, these would probably just introduce instability and bugs). But being able to read guest registers and memory seems like a reasonable feature.

Here's a rough roadmap we see with some notes for each point:

Not all of these need to happen in upstream - we could maintain a fork that expands the plugin APIs and then build our plugins atop these expanded APIs if necessary. That model would beat our current model, but the best case would be to get as much as we can merged with upstream.

* The patch we sent upstream got some feedback, but we ended up blocked on a concern that was roughly "You need to also give us a plugin that does something with these changes", which is difficult while there's no way for plugins to read/write any guest state. If the plugin API is expanded to allow plugins to read guest memory, we can include some plugins with that PR, such as stringsearch or OSI Linux.
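
As a rough illustration of why read access to guest memory matters for a stringsearch-style analysis, the sketch below matches a byte string across data that arrives in separate chunks. It is a simplified model, not the plugin's actual code: `StreamMatcher` is a hypothetical name, and the chunks are supplied by hand where a real plugin would get them from memory-access callbacks.

```python
from collections import deque
from typing import Deque, List


class StreamMatcher:
    """Find a byte string in a stream delivered in arbitrary chunks."""

    def __init__(self, needle: bytes) -> None:
        if not needle:
            raise ValueError("needle must be non-empty")
        self.needle = needle
        # Rolling window holding the most recent len(needle) bytes.
        self.window: Deque[int] = deque(maxlen=len(needle))
        self.seen = 0  # total bytes consumed so far

    def feed(self, data: bytes) -> List[int]:
        """Consume one chunk; return stream offsets where a match starts."""
        hits: List[int] = []
        for byte in data:
            self.window.append(byte)
            self.seen += 1
            if len(self.window) == len(self.needle) and bytes(self.window) == self.needle:
                hits.append(self.seen - len(self.needle))
        return hits


matcher = StreamMatcher(b"root")
first = matcher.feed(b"user=ro")   # match not complete yet
second = matcher.feed(b"ot;root")  # completes one match, then finds another
```

Keeping a rolling window means a string split across two accesses is still found, which is the case a per-access comparison would miss.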

XVilka commented 1 year ago

@AndrewFasano the latest patch is: https://lore.kernel.org/qemu-devel/20231103195956.1998255-1-alex.bennee@linaro.org/T/#ma5a045817c5527ae396f8b16c3b7d710dc5e274f

As for memory access, there is no patch so far yet: https://gitlab.com/qemu-project/qemu/-/issues/1719

XVilka commented 9 months ago

@AndrewFasano API for reading registers was finally merged: https://gitlab.com/qemu-project/qemu/-/commit/8df5e27cf71c727a3e1bc9172819ec69eca32ff4

See https://gitlab.com/qemu-project/qemu/-/commit/af6e4e0a22c18a7cc97650caec56ed99c9899dd7 on how to use the new API within contrib/plugins/execlog.c.

AndrewFasano commented 9 months ago

Thanks for the update, that's very exciting to hear! Will definitely help us get things moving with this upgrade.

stsquad commented 2 months ago

I've pulled the memory tracing APIs into https://gitlab.com/stsquad/qemu/-/tree/plugins/next?ref_type=heads and should be posting to the list in the next week or so.

stsquad commented 2 months ago

Please see https://patchew.org/QEMU/20240910140733.4007719-1-alex.bennee@linaro.org/ for the current status.

AndrewFasano commented 2 months ago

Amazing, thanks for the update @stsquad! We'll work to update our plugins and "QPP" (inter-plugin interaction) system on top of these changes! Hopefully we'll have some patches to share with the mailing list soon.

AndrewFasano commented 2 months ago

I updated our old patch series and some experimental porting work we had done previously atop these changes. With your updates @stsquad, it seems like we'll be able to move a bunch of our analyses over to qemu without too much effort 🎉

I'll wait on your patch series to get merged and then I think we can start sending some of these changes to the qemu mailing list to get feedback! Our VMI system is massive and complicated so I'm not sure it will ever be worth pulling into upstream, but hopefully the API changes and other examples can make the cut!

Placeholder github PR is here showing how we can add virtual machine introspection and syscall logging which provides some ugly output like the following. Future work might explore integrating our syscall tables to map names and argument types for various architectures into this plugin so we can pretty-print the syscalls.

Process ['systemd', pid 1, ppid 0, asid 3dadc000] PC 7f56fbb62c8c: syscall 101(ffffff9c, 7ffde7d94560, 80000, 0, 0, 7f56fb76043a)
Process ['systemd', pid 1, ppid 0, asid 3dadc000] PC 7f56fbb627c1: syscall 5(8, 7ffde7d94400, 7ffde7d94400, 0, b40, 7ffde7d945a7)
Process ['systemd', pid 1, ppid 0, asid 3dadc000] PC 7f56fbb6307f: syscall 0(8, 55a347e2b540, 400, 0, b40, 7ffde7d945a7)
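
A minimal sketch of the pretty-printing idea, working on the log text rather than inside the plugin. It assumes the guest is x86_64 Linux and that the logged syscall number is hexadecimal (so `101` is 257, `openat`, which fits the first argument `ffffff9c`, i.e. `AT_FDCWD`). `SYSCALL_NAMES` is a hand-written three-entry subset for illustration, not PANDA's syscall tables, and `pretty` is a hypothetical helper.

```python
import re
from typing import Dict, Optional

# Illustrative subset of the x86_64 Linux syscall table (assumption: the
# guest is x86_64 and the logged syscall number is hexadecimal).
SYSCALL_NAMES: Dict[int, str] = {0: "read", 5: "fstat", 257: "openat"}

LINE_RE = re.compile(
    r"PC (?P<pc>[0-9a-f]+): syscall (?P<num>[0-9a-f]+)\((?P<args>[^)]*)\)"
)


def pretty(line: str) -> Optional[str]:
    """Turn one log line into 'name(0xarg, ...)', or None if it doesn't parse."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    number = int(m.group("num"), 16)
    # Fall back to a numeric placeholder for syscalls missing from the table.
    name = SYSCALL_NAMES.get(number, f"sys_{number}")
    args = ", ".join("0x" + a.strip() for a in m.group("args").split(",") if a.strip())
    return f"{name}({args})"


example = pretty(
    "Process ['systemd', pid 1, ppid 0, asid 3dadc000] "
    "PC 7f56fbb6307f: syscall 0(8, 55a347e2b540, 400, 0, b40, 7ffde7d945a7)"
)
```

A fuller version would also use per-syscall argument types (file descriptors, pointers, flags) so that only meaningful arguments are printed.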

stsquad commented 2 months ago

The memory API changes are now merged and will be available in QEMU 9.2.