rust-vmm / vm-virtio

virtio implementation
Apache License 2.0

virtio-queue: fix alignment check for indirect descriptor #220

Closed zhuangel closed 1 year ago

zhuangel commented 1 year ago

see also: https://gitlab.com/virtio-fs/virtiofsd/-/issues/74

With a specific guest kernel config, virtiofsd panics when run with cloud-hypervisor, because the address of the indirect descriptor table is not properly aligned.

After reading the spec, there should be no restriction on the address of the indirect descriptor table:

2.6.5 The Virtqueue Descriptor Table
...
To increase ring capacity the driver can store a table
of indirect descriptors anywhere in memory, and insert
a descriptor in main virtqueue (with
flags&VIRTQ_DESC_F_INDIRECT on) that refers to memory
buffer containing this indirect descriptor table.
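To illustrate the point, here is a minimal sketch of a validation routine that constrains only the length of the indirect table, not its address. It is not the virtio-queue code: the function name `indirect_table_entries`, the error enum, and the exact set of checks are assumptions made for illustration. The only facts it relies on are that a split-virtqueue descriptor is 16 bytes and that the spec excerpt above allows the table to live anywhere in memory.

```rust
/// Size in bytes of one split-virtqueue descriptor:
/// u64 addr + u32 len + u16 flags + u16 next.
const VIRTQ_DESCRIPTOR_SIZE: u64 = 16;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum IndirectTableError {
    ZeroLength,
    PartialDescriptor,
    TooManyEntries,
    AddressOverflow,
}

/// Hypothetical helper: validate the `addr`/`len` pair of a descriptor that
/// has VIRTQ_DESC_F_INDIRECT set and return the number of table entries.
///
/// Note what is deliberately absent: there is no check that `addr` is a
/// multiple of 16, because the table may be stored anywhere in memory.
fn indirect_table_entries(addr: u64, len: u32) -> Result<u16, IndirectTableError> {
    let len = u64::from(len);
    if len == 0 {
        return Err(IndirectTableError::ZeroLength);
    }
    // The table must hold a whole number of descriptors.
    if len % VIRTQ_DESCRIPTOR_SIZE != 0 {
        return Err(IndirectTableError::PartialDescriptor);
    }
    // The table must not wrap around the end of the address space.
    if addr.checked_add(len).is_none() {
        return Err(IndirectTableError::AddressOverflow);
    }
    // Descriptor indices are 16 bits wide, so cap the entry count.
    let entries = len / VIRTQ_DESCRIPTOR_SIZE;
    if entries > u64::from(u16::MAX) {
        return Err(IndirectTableError::TooManyEntries);
    }
    Ok(entries as u16)
}
```

With this shape, a table at an address such as `0x1008` (8-byte aligned but not 16-byte aligned) is accepted, while a length of 24 bytes is rejected because it does not cover a whole number of descriptors.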

rbradford commented 1 year ago

I see you're using a very old kernel (relative to virtio-fs): 5.4. Perhaps the old kernel is responsible for this issue, since the descriptors, etc., are provided by the kernel.

Can you retry with a modern kernel?

zhuangel commented 1 year ago

Thanks, @rbradford. I ran the same test with guest kernel 5.19, and the same problem still exists.

But for 6.0 and 6.1, I got a boot failure like VmBoot(KernelMissingPvhHeader). I only changed SLUB to SLOB relative to the recommended base configuration https://github.com/cloud-hypervisor/cloud-hypervisor/blob/v28.0/resources/linux-config-x86_64, and I checked that CONFIG_PVH is enabled in the config file.

rbradford commented 1 year ago

You might need to compile newer kernels like this: CFLAGS="-Wa,-mx86-used-note=no" make bzImage -j `nproc`

zhuangel commented 1 year ago

Thanks for the suggestion, the problem still exists.

zhuangel commented 1 year ago

After hacking the linux-loader code (forcing the alignment of the ELF note section from 8 to 4), I could load the v6.1 kernel successfully. I am not sure of the root cause of this issue, but it should be unrelated to the current PR, because I reproduced the same issue (virtiofsd panic) after changing the guest kernel to v6.1.

FYI, the hacked code in linux-loader: https://github.com/rust-vmm/linux-loader/blob/v0.6.0/src/loader/x86_64/elf/mod.rs#L346 https://github.com/rust-vmm/linux-loader/blob/v0.6.0/src/loader/x86_64/elf/mod.rs#L347 https://github.com/rust-vmm/linux-loader/blob/v0.6.0/src/loader/x86_64/elf/mod.rs#L381
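For context on why the note alignment can matter, here is a rough sketch of the padding arithmetic involved when walking an ELF note. It is not the linux-loader code and does not establish the root cause; the helper names and the 4-byte name size (as for a name like "Xen\0") are assumptions chosen for illustration.

```rust
/// Size of an ELF note header: n_namesz, n_descsz, n_type (three u32 fields).
const NOTE_HEADER_SIZE: u64 = 12;

/// Round `value` up to the next multiple of `align` (a power of two).
fn align_up(value: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (value + align - 1) & !(align - 1)
}

/// Hypothetical helper: offset of the descriptor payload from the start of
/// the note, given the name size and the alignment used to pad the name.
fn note_desc_offset(n_namesz: u64, align: u64) -> u64 {
    NOTE_HEADER_SIZE + align_up(n_namesz, align)
}
```

For a 4-byte name, padding to 4 bytes places the payload at offset 16, while padding to 8 bytes places it at offset 20, so a parser and a producer that disagree on the alignment would read the payload from different positions.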