Support for virtio-GPU blob resources

DemiMarie commented 2 weeks ago

virtio-GPU blob resources require a special extension to the vhost-user protocol. @alyssais currently maintains an out-of-tree crosvm patch for this.

stefano-garzarella commented 2 weeks ago

@DemiMarie @alyssais It would be nice to merge it here. Feel free to open a PR.

alyssais commented 2 weeks ago

What I maintain are patches to allow using crosvm's vhost-user gpu backend, which provides a virtio-gpu device that is compatible with blob devices. This is not the same as the vhost-user-gpu device provided by QEMU, because in crosvm's implementation, the backend provides the display out, and the frontend doesn't need to know anything about displays or Wayland or whatever at all.

What crosvm's gpu-over-vhost-user protocol requires are two messages, VHOST_USER_BACKEND_SHMEM_MAP and VHOST_USER_BACKEND_SHMEM_UNMAP, which allow the backend to request that an fd be mapped into guest memory. These are non-standard messages, by which I mean they're not listed in QEMU's Vhost-user protocol document, and have IDs that are 1000 above the normal range.
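To make the shape of these two requests concrete, here is a minimal sketch in Rust. It is not the crosvm or QEMU wire format: the struct names, field names, and the `check_range` helper are all hypothetical, and the real messages also carry a file descriptor as ancillary data, which is not modelled here. It only illustrates the kind of range check a frontend would plausibly do before honouring a map or unmap request from the backend.

```rust
/// Hypothetical payload for "map this fd into guest-visible shared memory".
/// Illustrative only; not the actual crosvm or QEMU message layout.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ShmemMapRequest {
    /// Offset into the device's shared memory region where the mapping starts.
    shm_offset: u64,
    /// Offset into the file behind the fd that accompanies the message.
    fd_offset: u64,
    /// Length of the mapping in bytes.
    len: u64,
}

/// Hypothetical payload for removing a previously established mapping.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ShmemUnmapRequest {
    shm_offset: u64,
    len: u64,
}

#[derive(Debug, PartialEq, Eq)]
enum ShmemError {
    ZeroLength,
    OutOfRange,
}

/// Frontend-side sanity check (an assumption about what a frontend would do):
/// the requested range must be non-empty and lie inside the shared region.
fn check_range(shm_offset: u64, len: u64, region_size: u64) -> Result<(), ShmemError> {
    if len == 0 {
        return Err(ShmemError::ZeroLength);
    }
    // checked_add guards against u64 overflow in offset + len.
    let end = shm_offset.checked_add(len).ok_or(ShmemError::OutOfRange)?;
    if end > region_size {
        return Err(ShmemError::OutOfRange);
    }
    Ok(())
}

fn main() {
    let region_size: u64 = 1 << 30; // assumed 1 GiB shared memory region

    let map = ShmemMapRequest { shm_offset: 0x1000, fd_offset: 0, len: 0x4000 };
    let verdict = check_range(map.shm_offset, map.len, region_size);
    println!("map at {:#x} (fd offset {:#x}): {:?}", map.shm_offset, map.fd_offset, verdict);

    let unmap = ShmemUnmapRequest { shm_offset: 0x1000, len: 0x4000 };
    let verdict = check_range(unmap.shm_offset, unmap.len, region_size);
    println!("unmap at {:#x}: {:?}", unmap.shm_offset, verdict);
}
```

Whatever layout ends up being standardized, the frontend has to treat these offsets and lengths as untrusted input from the backend, which is why the sketch validates them before doing anything else.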

I've held off upstreaming these patches, because I expected that they should be standardized first (and at least have permanent IDs assigned). It looks like that discussion got started again recently!

stefano-garzarella commented 2 weeks ago

@alyssais thanks for the details! Yeah, I know @aesteve-rh is working on that, let's bring him into the loop ;-)

aesteve-rh commented 2 weeks ago

Yes! My plan is to post the next version of the patch soon. To be honest, I proposed this because I need it for another device, but since several devices already need it, I'm aiming for a generic implementation that works for all of them.

Anyway, regardless of how the QEMU implementation ends up looking, the changes to the vhost-user protocol should remain largely the same.

I have a vhost-user-backend branch that I am using for testing: https://github.com/rust-vmm/vhost/compare/main...aesteve-rh:vhost:mmap-backend-cmd

But I will wait until the QEMU patch is finished before posting.

DemiMarie commented 2 weeks ago

> What I maintain are patches to allow using crosvm's vhost-user gpu backend, which provides a virtio-gpu device that is compatible with blob devices. This is not the same as the vhost-user-gpu device provided by QEMU, because in crosvm's implementation, the backend provides the display out, and the frontend doesn't need to know anything about displays or Wayland or whatever at all.
>
> What crosvm's gpu-over-vhost-user protocol requires are two messages, VHOST_USER_BACKEND_SHMEM_MAP and VHOST_USER_BACKEND_SHMEM_UNMAP, which allow the backend to request that an fd be mapped into guest memory. These are non-standard messages, by which I mean they're not listed in QEMU's Vhost-user protocol document, and have IDs that are 1000 above the normal range.

Are these messages only needed for GPU acceleration, or are they also needed for Wayland passthrough without GPU acceleration?

alyssais commented 2 weeks ago

For crosvm-style virtio-gpu over vhost-user, they're needed for Wayland passthrough without GPU acceleration.

DemiMarie commented 2 weeks ago

> For crosvm-style virtio-gpu over vhost-user, they're needed for Wayland passthrough without GPU acceleration.

Darn it! Unfortunately, Xen doesn’t support mapping host memory into the guest yet :cry:. The underlying reason is that Xen does not support dealing with page faults from guest memory or revoking guest access to memory that Linux is about to move, so only specially-allocated memory can be mapped into a guest.

DemiMarie commented 2 weeks ago

@alyssais: Without GPU acceleration, are these messages only needed for the keymap? If so, could the keymap be copied instead of mapped? I’d like to not be blocked on Xen and Linux changes.

alyssais commented 2 weeks ago

I'm not an expert, but from looking at its code, it seems that at least wayland-proxy-virtwl needs to allocate host buffers and have them mapped into the guest even for guest→host messages, so I'd assume that applies to things like surfaces: https://github.com/talex5/wayland-proxy-virtwl/blob/1c0cd6d4f13454f0c72148b4c4a1c1e3b728205e/relay.ml#L208-L211

DemiMarie commented 2 weeks ago

That’s unfortunate. It also points to an inefficient design of the protocol, which forces an unnecessary and useless copy in the guest for shared memory buffers. A better design would either use “DMA” (have the VMM do the copy) or use udmabuf for zero copy. The advantage of having the VMM do the copy is that the VMM can refuse to modify a mapped buffer, protecting the host compositor from race conditions.
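As a rough illustration of the "VMM does the copy" option, the sketch below models a host-side buffer that the VMM fills on the guest's behalf and refuses to touch while the compositor may still be reading it. Everything here is hypothetical: `HostBuffer`, the `in_use_by_compositor` flag, and `vmm_copy_from_guest` are made-up names, and how a real VMM would learn that a buffer is busy (for example, by tracking when the compositor releases it) is assumed rather than shown.

```rust
/// Hypothetical host-side buffer owned by the VMM and shared with the compositor.
struct HostBuffer {
    data: Vec<u8>,
    /// Assumed to be set while the compositor may still read the buffer.
    in_use_by_compositor: bool,
}

#[derive(Debug, PartialEq, Eq)]
enum CopyError {
    BufferBusy,
    SizeMismatch,
}

/// Copy guest-provided bytes into the host buffer, but only when the
/// compositor is not using it. The guest never writes to host memory directly.
fn vmm_copy_from_guest(guest: &[u8], host: &mut HostBuffer) -> Result<(), CopyError> {
    if host.in_use_by_compositor {
        // Refusing here is what prevents the guest from racing the compositor.
        return Err(CopyError::BufferBusy);
    }
    if guest.len() != host.data.len() {
        return Err(CopyError::SizeMismatch);
    }
    host.data.copy_from_slice(guest);
    Ok(())
}

fn main() {
    let mut host = HostBuffer { data: vec![0; 4], in_use_by_compositor: false };

    // Buffer is idle: the copy goes through.
    println!("{:?}", vmm_copy_from_guest(&[1, 2, 3, 4], &mut host));

    // Buffer is busy: the VMM refuses and the contents stay untouched.
    host.in_use_by_compositor = true;
    println!("{:?}", vmm_copy_from_guest(&[9, 9, 9, 9], &mut host));
    println!("{:?}", host.data);
}
```

With a mapped buffer there is no such checkpoint, since the guest can write at any time. The udmabuf option avoids the copy entirely but does not give the VMM this point of control.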

rust-vmm / vhost

Support for virtio-GPU blob resources #245