Closed tmakatos closed 3 years ago
I've done some work on supporting live migration in libmuser. I've oversimplified most of the code to make it work. There are many open issues, most importantly understanding how some registers in the migration region are supposed to work (more information in the code as FIXME and TODO).
Also, check the VFIO live migration code in QEMU; it might answer a lot of questions about how the registers are used.
I see that in hw/vfio/migration.c, as soon as the device transitions to the stop-and-copy phase (before migration data are actually copied), the device configuration state is stored. IIUC the client is responsible for saving/loading it, so no need to do anything in muser.
I received some clarifications on some questions I had (https://marc.info/?l=kvm&m=160502788221980&w=2):
- `data_offset` marks a new iteration, so re-reading `pending_bytes` and `data_size` is allowed.
- The `pending_bytes` register is volatile.
- `data_size` out of sequence is undefined.
- `data_offset` and `data_size` are invariant during an iteration.

Fixed.
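To make the register semantics above concrete, here is a minimal sketch of one way a client could drain device state iteration by iteration. It runs against an in-memory fake, not against libmuser or the kernel: `struct fake_dev`, the `read_*` helpers and `save_state` are all hypothetical names, and the assumption that reading `data_offset` is what stages the next chunk is my reading of the clarifications, not something taken from the code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device exposing the migration registers. */
struct fake_dev {
    const uint8_t *state;  /* device state that still has to be migrated */
    uint64_t remaining;    /* backs the volatile pending_bytes register */
    uint64_t data_offset;  /* invariant for the current iteration */
    uint64_t data_size;    /* invariant for the current iteration */
    uint8_t window[64];    /* data area of the (fake) migration region */
};

/* pending_bytes is volatile: it may differ on every read. */
static uint64_t read_pending_bytes(const struct fake_dev *d)
{
    return d->remaining;
}

/* Assumption: reading data_offset starts a new iteration, so the fake
 * device stages the next chunk here and fixes data_size for it. */
static uint64_t read_data_offset(struct fake_dev *d)
{
    uint64_t n = d->remaining;
    if (n > sizeof(d->window))
        n = sizeof(d->window);
    memcpy(d->window, d->state, n);
    d->state += n;
    d->remaining -= n;
    d->data_offset = 0;
    d->data_size = n;
    return d->data_offset;
}

/* Only meaningful after data_offset has been read for this iteration. */
static uint64_t read_data_size(const struct fake_dev *d)
{
    return d->data_size;
}

/* Client side: pending_bytes, then data_offset, then data_size, then data. */
static uint64_t save_state(struct fake_dev *d, uint8_t *out, uint64_t cap)
{
    uint64_t total = 0;
    while (read_pending_bytes(d) > 0) {
        uint64_t off = read_data_offset(d);
        uint64_t size = read_data_size(d);
        assert(off + size <= sizeof(d->window));
        assert(total + size <= cap);
        memcpy(out + total, d->window + off, size);
        total += size;
    }
    return total;
}
```

The loop always reads the registers in the same order, so `data_size` is never read out of sequence, and nothing caches `pending_bytes` across iterations.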
Implementing live migration requires providing a migration region with a VFIO region info type capability. Currently we only support the sparse mmap capability, so we don't require the user to create a VFIO capability header and chain it with the sparse mmap capability; we simply receive sparse areas via `lm_reg_info_t.mmap_areas` at context creation time. This made sense in the past as there was no other VFIO region capability that we had to support, but this now needs to change. Rather than adding a new member to `lm_reg_info_t` for passing the VFIO region info type capability (along with `mmap_areas`), we should refactor the code to take a generic VFIO region capability, just like we do for the PCI capabilities (`lm_cap_t`). This way we'll instantly support any new VFIO region capability that might be added in the future.

So first, we need to make this change. Second, we need to extend the client/server samples to use this code. Third, we can continue working on the actual live migration.
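As a rough illustration of the "generic VFIO region capability" idea, the sketch below chains arbitrary capabilities behind a common header and looks them up by ID. It is not the libmuser API: `struct cap_header`, `struct cap_type`, `struct cap_chain`, `cap_chain_add` and `cap_chain_find` are hypothetical names, the header only mirrors the id/version/next layout used by VFIO capability headers, and the offsets here are relative to the chain buffer as a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of a VFIO-style capability header. */
struct cap_header {
    uint16_t id;
    uint16_t version;
    uint32_t next;  /* offset of the next capability in the buffer, 0 = end */
};

/* Capability IDs, chosen to match the sparse mmap and type capabilities. */
enum { CAP_SPARSE_MMAP = 1, CAP_TYPE = 2 };

/* Hypothetical region info type capability: header plus type/subtype. */
struct cap_type {
    struct cap_header header;
    uint32_t type;
    uint32_t subtype;
};

/* Flat buffer holding the chain; zero-initialize before use. */
struct cap_chain {
    uint8_t buf[256];
    size_t used;  /* bytes in use */
    size_t last;  /* offset of the most recently added capability */
    int count;    /* number of capabilities in the chain */
};

/* Appends any capability (header + payload) and links it to the previous
 * one. Returns 0 on success, -1 if the buffer is full. */
static int cap_chain_add(struct cap_chain *c, const void *cap, size_t size)
{
    struct cap_header hdr;
    size_t off = c->used;

    assert(size >= sizeof(hdr));
    if (size > sizeof(c->buf) - c->used)
        return -1;
    memcpy(c->buf + off, cap, size);

    /* The new capability always terminates the chain. */
    memcpy(&hdr, c->buf + off, sizeof(hdr));
    hdr.next = 0;
    memcpy(c->buf + off, &hdr, sizeof(hdr));

    if (c->count > 0) {
        /* Point the previous tail at the new capability. */
        memcpy(&hdr, c->buf + c->last, sizeof(hdr));
        hdr.next = (uint32_t)off;
        memcpy(c->buf + c->last, &hdr, sizeof(hdr));
    }
    c->last = off;
    c->used = off + size;
    c->count++;
    return 0;
}

/* Walks the chain and copies out the first capability with the given ID.
 * Returns 0 if found, -1 otherwise. */
static int cap_chain_find(const struct cap_chain *c, uint16_t id,
                          void *out, size_t size)
{
    size_t off = 0;

    if (c->count == 0)
        return -1;
    for (;;) {
        struct cap_header hdr;
        memcpy(&hdr, c->buf + off, sizeof(hdr));
        if (hdr.id == id) {
            if (size > c->used - off)
                return -1;
            memcpy(out, c->buf + off, size);
            return 0;
        }
        if (hdr.next == 0)
            return -1;
        off = hdr.next;
    }
}
```

With something along these lines, the user would pass the sparse mmap capability and the type capability through the same interface, and a future capability would only need a new ID and payload struct rather than a new member in `lm_reg_info_t`.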