CuriousGeorgiy opened this issue 1 year ago
We think this is more focused on firecracker-containerd and may be able to get better help there, so it may make sense to transfer this issue to the firecracker-containerd repository. Do you see an issue with this?
@JonathanWoollett-Light
I agree, but at the same time I would like to emphasize that, AFAIC, Firecracker may not currently provide enough capabilities to solve this problem.
BTW, just to make it clearer, my current hacky solution looks like this:
diff --git a/src/api_server/swagger/firecracker.yaml b/src/api_server/swagger/firecracker.yaml
index 930017e8..0393ab44 100644
--- a/src/api_server/swagger/firecracker.yaml
+++ b/src/api_server/swagger/firecracker.yaml
@@ -1093,6 +1093,7 @@ definitions:
     required:
       - mem_file_path
       - snapshot_path
+      - disk_device_path
     properties:
       enable_diff_snapshots:
         type: boolean
@@ -1108,6 +1109,10 @@ definitions:
         type: boolean
         description:
           When set to true, the vm is also resumed if the snapshot load is successful.
+      disk_device_path:
+        type: string
+        description:
+          Path to the disk device backing the disk state at the time of the snapshot creation.
 
   TokenBucket:
     type: object
diff --git a/src/devices/src/virtio/block/persist.rs b/src/devices/src/virtio/block/persist.rs
index 3d72bf97..564bc839 100644
--- a/src/devices/src/virtio/block/persist.rs
+++ b/src/devices/src/virtio/block/persist.rs
@@ -91,7 +91,7 @@ pub struct BlockState {
     )]
     cache_type: CacheTypeState,
     root_device: bool,
-    disk_path: String,
+    pub disk_path: String,
     virtio_state: VirtioDeviceState,
     rate_limiter_state: RateLimiterState,
     #[version(start = 3)]
diff --git a/src/vmm/src/persist.rs b/src/vmm/src/persist.rs
index 26317225..baf54ca1 100644
--- a/src/vmm/src/persist.rs
+++ b/src/vmm/src/persist.rs
@@ -442,7 +442,21 @@ pub fn restore_from_snapshot(
 ) -> std::result::Result<Arc<Mutex<Vmm>>, LoadSnapshotError> {
     use self::LoadSnapshotError::*;
     let track_dirty_pages = params.enable_diff_snapshots;
-    let microvm_state = snapshot_state_from_file(&params.snapshot_path, version_map)?;
+    let mut microvm_state = snapshot_state_from_file(&params.snapshot_path, version_map)?;
+
+    let disk_device_path = &params.disk_device_path;
+    if !disk_device_path.is_empty() {
+        // We assume that each microVM is backed by exactly one container image
+        // snapshot device (i.e., that no more than one container is run on each microVM).
+        assert_eq!(microvm_state.device_states.block_devices.len(), 2);
+        for i in 0..2 {
+            // We assume that one of the block devices is the rootfs, the other being the
+            // container image snapshot.
+            if microvm_state.device_states.block_devices[i].device_state.disk_path.contains("snap") {
+                microvm_state.device_states.block_devices[i].device_state.disk_path = disk_device_path.clone();
+            }
+        }
+    }
 
     // Some sanity checks before building the microvm.
     snapshot_state_sanity_check(&microvm_state)?;
diff --git a/src/vmm/src/vmm_config/snapshot.rs b/src/vmm/src/vmm_config/snapshot.rs
index 9662de08..83eda2be 100644
--- a/src/vmm/src/vmm_config/snapshot.rs
+++ b/src/vmm/src/vmm_config/snapshot.rs
@@ -56,6 +56,13 @@ pub struct LoadSnapshotParams {
     /// is successful.
     #[serde(default)]
     pub resume_vm: bool,
+    /// Path to the disk device backing the disk state at the time
+    /// of the snapshot creation.
+    ///
+    /// Since the disk device path is not deterministic (it depends on the
+    /// containerd devmapper snapshotter implementation), we need to manually
+    /// patch the microVM state block device path backing the container snapshot.
+    pub disk_device_path: String,
 }
 
 /// The microVM state options.
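To make it concrete, a client driving the patched API would send something like the request below. This is only an illustrative sketch in Go (chosen since the real caller would be firecracker-containerd): the socket path, snapshot paths, and devmapper device path are placeholders, and the disk_device_path field only exists with the diff above applied.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"net"
	"net/http"
)

// newFirecrackerClient builds an http.Client that talks to the Firecracker
// API server over its Unix domain socket.
func newFirecrackerClient(socketPath string) *http.Client {
	return &http.Client{
		Transport: &http.Transport{
			DialContext: func(ctx context.Context, _, _ string) (net.Conn, error) {
				var d net.Dialer
				return d.DialContext(ctx, "unix", socketPath)
			},
		},
	}
}

func main() {
	// Placeholder socket path.
	client := newFirecrackerClient("/tmp/firecracker.socket")

	// Load the snapshot, passing the actual (non-deterministic) container
	// snapshot device path through the disk_device_path field added by the
	// diff above. All paths here are placeholders.
	body := []byte(`{
	  "snapshot_path": "/srv/snap/vmstate",
	  "mem_file_path": "/srv/snap/mem",
	  "resume_vm": true,
	  "disk_device_path": "/dev/mapper/fc-thinpool-snap-7"
	}`)
	req, err := http.NewRequest(http.MethodPut, "http://localhost/snapshot/load", bytes.NewReader(body))
	if err != nil {
		panic(err)
	}
	req.Header.Set("Content-Type", "application/json")

	resp, err := client.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("PUT /snapshot/load:", resp.Status)
}
```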
Hey @CuriousGeorgiy, is this still an issue?
I think you found out that you can patch the drives after loading a snapshot and before resuming the microVM (not considering the issues related to #4036).
Is it ok if I close this?
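For reference, a rough sketch of that sequence against the stock API: load the snapshot paused, retarget the drive via PATCH /drives, then resume via PATCH /vm. The drive ID and paths are placeholders, and the client is assumed to be an http.Client wired to the Firecracker API socket, as in the earlier sketch. As the follow-up comments explain, the load step itself still expects the backing files to exist at their original paths.

```go
package restore

import (
	"fmt"
	"net/http"
	"strings"
)

// retargetDriveAndResume sketches the suggested flow: load the snapshot
// without resuming, point the drive with ID driveID at its new backing
// device, then resume the microVM. IDs and paths are placeholders.
func retargetDriveAndResume(client *http.Client, driveID, newPath string) error {
	calls := []struct {
		method, path, body string
	}{
		// 1. Load the snapshot but keep the microVM paused.
		{http.MethodPut, "/snapshot/load",
			`{"snapshot_path": "/srv/snap/vmstate", "mem_file_path": "/srv/snap/mem", "resume_vm": false}`},
		// 2. Retarget the block device to the new host path.
		{http.MethodPatch, "/drives/" + driveID,
			fmt.Sprintf(`{"drive_id": %q, "path_on_host": %q}`, driveID, newPath)},
		// 3. Resume the microVM.
		{http.MethodPatch, "/vm", `{"state": "Resumed"}`},
	}

	for _, c := range calls {
		req, err := http.NewRequest(c.method, "http://localhost"+c.path, strings.NewReader(c.body))
		if err != nil {
			return err
		}
		req.Header.Set("Content-Type", "application/json")
		resp, err := client.Do(req)
		if err != nil {
			return err
		}
		resp.Body.Close()
		if resp.StatusCode >= 300 {
			return fmt.Errorf("%s %s failed: %s", c.method, c.path, resp.Status)
		}
	}
	return nil
}
```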
Hey @bchalios
I think you found out that you can patch the drives after loading a snapshot and before resuming the microVM (not considering the issues related to https://github.com/firecracker-microvm/firecracker/issues/4036).
For this to work, I would have to go through the hassle of unmounting the drives inside the guest before pausing the VM and mounting them back after resuming it (and I am not sure that is actually achievable, since, AFAIC, unmounting can fail, for instance because of open files). I would prefer the "resource renaming" approach that @roypat mentioned in his comment.
So, the resource renaming approach is what the PATCH /drives API does. Unless @roypat referred to something else that I'm missing?
@bchalios
It's actually even simpler: AFAIC, what you suggest will only work after boot (i.e., after the snapshot has been loaded), but loading the snapshot still requires the block devices to be available at the paths they had at snapshot creation time.
Technically, this could be solved by creating stub devices just for snapshot loading, but again, that only adds to the mount/unmount hassle.
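For illustration, the stub-device idea would amount to something like the sketch below: create placeholder backing files at the paths recorded in the snapshot (which the caller would have to track out of band), load the snapshot, retarget the drives via PATCH /drives, and then clean the stubs up. Everything here is hypothetical; in particular, whether a sparse stub of the recorded size satisfies the device emulation at load time is untested, which is part of why this approach feels fragile.

```go
package restore

import (
	"fmt"
	"os"
)

// createStubBackingFiles creates placeholder files at the block device paths
// recorded in the snapshot so that PUT /snapshot/load can open them. It
// assumes the caller knows those paths and sizes (e.g. from metadata saved at
// snapshot creation time). This is an illustration, not a tested recipe.
func createStubBackingFiles(recorded map[string]int64) (cleanup func(), err error) {
	var created []string
	cleanup = func() {
		for _, p := range created {
			os.Remove(p)
		}
	}
	for path, size := range recorded {
		if _, statErr := os.Stat(path); statErr == nil {
			// Something already exists at the recorded path; leave it alone.
			continue
		}
		f, err := os.Create(path)
		if err != nil {
			cleanup()
			return nil, fmt.Errorf("creating stub %s: %w", path, err)
		}
		// Size the stub as a sparse file of the recorded length.
		if err := f.Truncate(size); err != nil {
			f.Close()
			cleanup()
			return nil, fmt.Errorf("sizing stub %s: %w", path, err)
		}
		f.Close()
		created = append(created, path)
	}
	return cleanup, nil
}
```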
Oh, I see now. You and Patrick are right. As Patrick said, this is a limitation of the snapshot API at the moment. FYI, this is not just about block devices; net and vsock devices have the same limitation. So we will need to solve the problem for all of these. Will keep you posted.
Opened a draft PR #4072 to illustrate what I am trying to achieve.
Feature Request
I am hacking on firecracker-containerd to support firecracker snapshots, and I am facing the following problems:
Firstly, the addVsocksHandler fails, as expected, with the following error:
If I comment this handler out, everything works fine and firecracker-containerd is able to connect to the agent. It seems like this handler is redundant and could simply be removed (I can submit a patch to firecracker-go-sdk).
Secondly, I am trying to figure out a way to restore the disk backing files. The problem I am facing is that the container image snapshot device paths are not deterministic (their assignment depends on the containerd snapshotter implementation). To overcome this, I tried to use the drive mounts feature of firecracker-containerd to mount the container snapshot device for the snapshot-loaded VM at the path the original device had, but it doesn't work, since drives cannot be configured^1 for snapshot-booted VMs.
Describe the desired solution
Do you have any ideas on how the drive mounts approach could be made to work?
Describe possible alternatives
Right now I am manually patching the block device path corresponding to the container snapshot in the microVM state to a path provided in the LoadSnapshot request (which is obviously not ideal).