Open pcworld opened 3 years ago
It seems like you're finding all the dragons hiding in the PyPANDA code lately! This is a bad design, but it's the best we could come up with given how much effort we were willing to put into it at the time. (There might be a better way, and we definitely know more about this now than we did when we made it over a year ago.)
The entire purpose of the main_loop_wait queue is to run internal QEMU functions at a "safe" time when they don't cause deadlocks. Since QEMU is multithreaded, it's not always safe or allowed to call functions from the CPU emulation thread, where our callbacks are (usually) triggered. Originally, we implemented the panda.snap function by directly calling panda_snap in the panda API, but we found that it was causing deadlocks, so we moved it to this queued design. @lacraig2 or @tleek might remember more details.
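To make the deferral idea concrete, here is a toy sketch in plain Python. It is not PANDA or QEMU code: ToyMainLoop, queue_fn, main_loop_wait, take_snapshot, and cpu_thread_callback are hypothetical names, and none of QEMU's real locking is modeled. It only shows a callback on one thread handing work to a queue that a different thread drains later.

```python
import queue
import threading


class ToyMainLoop:
    """Toy stand-in for a main loop that runs deferred functions (not PANDA code)."""

    def __init__(self):
        self._pending = queue.Queue()  # thread-safe FIFO of (fn, args) pairs
        self.log = []

    def queue_fn(self, fn, args=()):
        # Safe to call from any thread: it only records the request.
        self._pending.put((fn, tuple(args)))

    def main_loop_wait(self):
        # Meant to be called only by the main-loop thread.
        # Runs everything queued so far and returns how many functions ran.
        ran = 0
        while True:
            try:
                fn, args = self._pending.get_nowait()
            except queue.Empty:
                return ran
            fn(*args)
            ran += 1


def take_snapshot(loop, name):
    # Placeholder for work that must not run on the emulation thread.
    loop.log.append(f"snapshot:{name}")


def cpu_thread_callback(loop):
    # Runs on the "emulation" thread: defer the work instead of doing it here.
    loop.queue_fn(take_snapshot, [loop, "snap1"])
    loop.log.append("callback returned")


loop = ToyMainLoop()
cpu = threading.Thread(target=cpu_thread_callback, args=(loop,))
cpu.start()
cpu.join()

# Back on the main thread: drain the queue at a point we consider safe.
ran = loop.main_loop_wait()
print(ran, loop.log)
```

The log order shows the limitation being discussed: the queued work runs only after the callback has returned, not at the point where it was requested.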
If you're interested in helping us redesign this so we can take a snapshot in the middle of a block, it would be a huge improvement and we'd be happy to help.
Also, are you in our Slack? We'd be happy to chat about this more there (or on here is fine too!): https://panda.re/invite.php
I'm not exactly sure, but I think I may have run into some issues with this as well in the past. If I recall correctly, my workaround was to stop the VM and issue a "begin_record" request via QMP, which would create a PANDA-compatible snapshot. Then I would end the recording and restore the snapshot.
However, this was all from the avatar2-based programming model, where I have separate threads outside of the PyPanda object anyhow.
Still relevant; as a workaround I now use: panda.run_monitor_cmd('savevm snapshot_name')
(I've joined your Slack but don't have much time at the moment to dig even deeper into Panda; maybe later.)
Not a fix, but in case it helps - I've been using the following to take a snapshot at a precise location (i.e., in a callback triggered by a hook) and it seems to work:
charptr = panda.ffi.new("char[]", snap_name.encode())
panda.vm_stop()
panda.queue_main_loop_wait_fn(panda.libpanda.panda_snap, [charptr])
panda.queue_main_loop_wait_fn(panda.libpanda.panda_cont)
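For reuse across several hooks, the four lines above could be wrapped in a small helper. This is only a sketch: snapshot_in_callback is a hypothetical name, it uses only the calls already shown in this comment, and it assumes the queue keeps a reference to the argument list until the queued function runs. Returning the buffer lets the caller hold on to it as well, in case that assumption does not hold.

```python
def snapshot_in_callback(panda, snap_name):
    """Queue a snapshot from inside a callback (hypothetical helper).

    `panda` is expected to be a PyPANDA Panda object; only the calls from
    the snippet above are used.
    """
    # Allocate a C string for the snapshot name. A cffi buffer is freed once
    # nothing references it, so it must stay alive until panda_snap has run.
    charptr = panda.ffi.new("char[]", snap_name.encode())

    # Stop the guest first, then queue the snapshot and the resume so that
    # they run later from the main loop rather than from this callback.
    panda.vm_stop()
    panda.queue_main_loop_wait_fn(panda.libpanda.panda_snap, [charptr])
    panda.queue_main_loop_wait_fn(panda.libpanda.panda_cont)

    # Hand the buffer back so the caller can keep a reference if needed.
    return charptr
```

Inside a hook this would be called as snapshot_in_callback(panda, "my_snap"), keeping the return value around until the guest has resumed.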
This issue has gone stale! If you believe it is still a problem, please comment on this issue or it will be closed in 30 days
yes
I'm having trouble understanding the execution model of PyPanda. I couldn't figure out what the proper way would be to make a synchronous qcow2 snapshot, i.e. snapshotting at a specific point of guest execution.
Panda.snap uses Panda.queue_main_loop_wait_fn to queue functions into the qemu main loop. As far as I understand, as soon as I run panda.run(), I need to queue a callback with panda.queue_blocking to be able to run panda.stop_run(). However, panda.stop_run() appears to stop guest execution without waiting for execution of all queued panda.main_loop_wait_fnargs. Further, a second execution of panda.run() clears panda.main_loop_wait_fnargs.
I have tried the following, and while it works sometimes, at other times it deadlocks in panda.run() (not sure why), and as it looks very unintuitive, I don't think it's an intended solution: