panda-re / panda

Platform for Architecture-Neutral Dynamic Analysis
https://panda.re

PyPanda: How to synchronously make a snapshot? #1047

Open pcworld opened 3 years ago

pcworld commented 3 years ago

I'm having trouble understanding the execution model of PyPanda. I couldn't figure out what the proper way would be to make a synchronous qcow2 snapshot, i.e. snapshotting at a specific point of guest execution.

Panda.snap uses Panda.queue_main_loop_wait_fn to queue functions into the qemu main loop. As far as I understand, as soon as I run panda.run(), I need to queue a callback with panda.queue_blocking to be able to run panda.stop_run(). However, panda.stop_run() appears to stop guest execution without waiting for execution of all queued panda.main_loop_wait_fnargs. Further, a second execution of panda.run() clears panda.main_loop_wait_fnargs.

I have tried the following, and while it works sometimes, at other times it deadlocks in panda.run() (not sure why), and as it looks very unintuitive, I don't think it's an intended solution:

        @self.panda.queue_blocking
        def run_cmd():
            self.panda.snap('mysnapshot')
            self.panda.queue_main_loop_wait_fn(self.panda.queue_async, [self.panda.stop_run])

        self.panda.run()

AndrewFasano commented 3 years ago

It seems like you're finding all the dragons hiding in the PyPANDA code lately! This is a bad design, but it's the best we could come up with given how much effort we were willing to put into it previously. (There might be a better way, and we definitely know more about this now than we did when we built it over a year ago.)

The entire purpose of the main_loop_wait queue is to run internal QEMU functions at a "safe" time where they don't cause deadlocks. Since QEMU is multithreaded, it's not always safe or allowed to call functions from the CPU emulation thread, which is where our callbacks are (usually) triggered. Originally, we implemented the panda.snap function by directly calling panda_snap in the PANDA API, but we found that it was causing deadlocks, so we moved it to this queued design. @lacraig2 or @tleek might remember more details.
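To make the queued design concrete, here is a toy model in plain Python. It is not PANDA's or QEMU's implementation; `ToyMainLoop`, `queue_fn`, `drain`, and `demo` are hypothetical names invented for illustration. It only shows why a function queued from a callback runs after that callback has returned, which is why the snapshot is not taken at the exact point of guest execution.

```python
import queue
import threading


class ToyMainLoop:
    """Toy stand-in for an emulator main loop with a deferred-call queue."""

    def __init__(self):
        self._pending = queue.Queue()
        self.log = []

    def queue_fn(self, fn, args=()):
        # Safe to call from any thread: this only records the request.
        self._pending.put((fn, tuple(args)))

    def drain(self):
        # Meant to be called only by the main-loop thread, at a point where
        # running the queued functions is safe. Returns how many were run.
        ran = 0
        while True:
            try:
                fn, args = self._pending.get_nowait()
            except queue.Empty:
                return ran
            fn(*args)
            ran += 1


def demo():
    loop = ToyMainLoop()

    def take_snapshot(name):
        loop.log.append(f"main loop: snapshot {name}")

    def cpu_thread():
        # A "callback" on the emulation thread requests a snapshot
        # but does not take it itself.
        loop.log.append("callback: request snapshot")
        loop.queue_fn(take_snapshot, ["mysnapshot"])
        loop.log.append("callback: returned")

    worker = threading.Thread(target=cpu_thread)
    worker.start()
    worker.join()
    loop.drain()  # the main loop reaches its safe point later
    return loop.log
```

In this model the snapshot entry always lands in the log after "callback: returned", mirroring the asynchronous behaviour described in this thread.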

If you're interested in helping us redesign this so we can take a snapshot in the middle of a block, it would be a huge improvement and we'd be happy to help.

Also, are you in our Slack? We'd be happy to chat about this more there (or on here is fine too!): https://panda.re/invite.php

mariusmue commented 3 years ago

I'm not exactly sure, but I think I may have run into some issues with this as well in the past. If I recall correctly, my workaround was to stop the VM and issue a "begin_record" request via QMP, which would create a PANDA-compatible snapshot. Then, I would end the recording and restore it.

However, this was all from the avatar2-based programming model, where I have separate threads outside of the PyPanda object anyhow.

github-actions[bot] commented 2 years ago

Stale issue message

pcworld commented 2 years ago

Still relevant; I now use the following as a workaround: panda.run_monitor_cmd('savevm snapshot_name')
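A minimal sketch of how that workaround might be wrapped, using only calls already mentioned in this thread (`queue_blocking`, `run_monitor_cmd`, `stop_run`). The helper name `snapshot_via_monitor` is hypothetical, and stopping the run right after the snapshot is an assumption about the desired flow, not something the workaround requires.

```python
def snapshot_via_monitor(panda, snap_name):
    """Queue a blocking task that saves a snapshot through the QEMU monitor.

    `panda` is expected to be a PyPANDA object; nothing runs until the
    queued task is executed, i.e. after panda.run() has been called.
    """

    @panda.queue_blocking
    def take_snapshot():
        # The workaround from this thread: ask the monitor to save the VM.
        panda.run_monitor_cmd(f"savevm {snap_name}")
        # Assumption: end the run once the snapshot exists.
        panda.stop_run()
```

Typical use would be to call `snapshot_via_monitor(panda, 'snapshot_name')` and then `panda.run()`.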

(I've joined your Slack but don't have much time at the moment to dig even deeper into Panda; maybe later.)

AndrewFasano commented 2 years ago

Not a fix, but in case it helps - I've been using the following to take a snapshot at a precise location (i.e., in a callback triggered by a hook) and it seems to work:

charptr = panda.ffi.new("char[]", snap_name.encode())
panda.vm_stop()
panda.queue_main_loop_wait_fn(panda.libpanda.panda_snap, [charptr])
panda.queue_main_loop_wait_fn(panda.libpanda.panda_cont)
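
The same four calls can be wrapped in a small helper for use inside a hook callback. This is only a sketch: the name `snapshot_from_callback` is hypothetical, and returning the buffer is an extra precaution based on cffi freeing `ffi.new` memory once the owning object is garbage collected. Whether PyPANDA already keeps the queued arguments alive is not confirmed here.

```python
def snapshot_from_callback(panda, snap_name):
    """Stop the guest, then queue a snapshot and a resume on the main loop."""
    # C string for the snapshot name; cffi owns this memory.
    charptr = panda.ffi.new("char[]", snap_name.encode())
    # Halt the guest so it does not run on before the snapshot is taken.
    panda.vm_stop()
    # Both calls are deferred until the main loop reaches a safe point.
    panda.queue_main_loop_wait_fn(panda.libpanda.panda_snap, [charptr])
    panda.queue_main_loop_wait_fn(panda.libpanda.panda_cont)
    # Returned so the caller can hold a reference until the snapshot is done.
    return charptr
```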

github-actions[bot] commented 2 years ago

This issue has gone stale! If you believe it is still a problem, please comment on this issue or it will be closed in 30 days

pcworld commented 2 years ago

yes