panda-re / panda

Platform for Architecture-Neutral Dynamic Analysis
https://panda.re
Other
2.5k stars 480 forks source link

Fix unloading of C plugins #1533

Open lluchs opened 2 months ago

lluchs commented 2 months ago

https://github.com/panda-re/panda/commit/76b5d84a9155f95c69cfc03dade634a990aa83ea removed the direct call to self.libpanda.panda_unload_plugins(), introduced by https://github.com/panda-re/panda/commit/bb83a2f6d270301542243c05d00db850796e3771. As noted in the comment in unload_plugins() in that commit, that function doesn't unload C plugins during shutdown, so the direct call is still required.

(Or at least I assume that is what is happening - adding the call back fixes uninit calls for me anyways.)

AndrewFasano commented 2 months ago

I think this might be a bit more complicated than it seems at first. Let me first walk through what we want here (mostly to have this written down for myself, seems like you might already understand this):

When PANDA goes to terminate an emulation because

We want to inform all the loaded plugins of the imminent shutdown by:

And finally clean up/unload plugin state, terminate the emulation, and exit.

In #1313 (76b5d84a9155f95c69cfc03dade634a990aa83ea) we fixed the case where a shutdown triggered from C/C++ wouldn't cause the unint method of pyplugins to run.

It sounds like the issue you're seeing is that the uninit_plugin method in C plugins isn't being run if a Python plugin shuts the emulator down - is that correct?

I'm worried that there might be some race conditions or issues with the change you have here because the existing unload_plugins method already calls libpanda.panda_unload_plugins - it queues up logic to unload pyplugins and c plugins at the next main_loop_wait. The main_loop_wait method exists to avoid deadlocks by scheduling certain work to occur in a time that we know to be (somewhat) "safe". If we unload plugins at other times I think we have a risk of unloading code while it might still be called and this could cause crashes during shutdown.

https://github.com/panda-re/panda/blob/3787038343bb1e7fa39ae543f85ebab699fd2a9f/panda/python/core/pandare/panda.py#L781-L785

But there's a comment specifically about this method being used during shutdown:

    XXX: If called during shutdown/exit, c plugins won't be unloaded
   because the next main_loop_wait will never happen. Instead, call
   panda.panda_finish directly (which is done at the end of panda.run())

Tracking that down, I see the end of the python logic driving emulation calls panda.panda_finish which ultimately returns execution to shutdown logic in vl.c here - but I don't see anything unloading plugins on this path - perhaps we should add this unload logic there instead?

https://github.com/panda-re/panda/blob/3787038343bb1e7fa39ae543f85ebab699fd2a9f/vl.c#L5088-L5115

lluchs commented 2 months ago

Thanks for the quick and detailed response! I read a bit more source code and now I'm less sure how everything is supposed to work :)

My Python script loads a Rust plugin and calls panda.run() and panda.stop_run() a couple times. It's important that uninit runs in the Rust plugin so that it flushes its output.

If I understand what is happening correctly, the call to stop_run() unloads the plugins via what happens in run(). It's not clear to me if that is what is supposed to happen - the Python API documentation says that end_analysis() will unload the plugins, but stop_run() will not.

Your idea with unloading the plugins in vl.c is actually what is already happening. It's just hidden away in the call to panda_cleanup(): https://github.com/panda-re/panda/blob/3787038343bb1e7fa39ae543f85ebab699fd2a9f/panda/src/common.c#L229-L235

However, that code actually never runs if you don't call panda_finish() yourself. The comment cited above implies that such a call is supposed to be at the end of panda.run(), but that is not the case. (I didn't check whether it used to be there.)

So now I'm not sure what to do here. My code now works after adding a call to panda_finish(). So possibly this is more a documentation issue and all the examples should have such a call (rather than just calling end_analysis() or stop_run()). Or end_analysis() should somehow call panda_finish() (but probably not stop_run(), since that is actually not supposed to unload plugins?).