Closed erichchan999 closed 2 weeks ago
The GPU example in this PR does not work on macOS QEMU due to the udmabuf framework requirement, which only exists on Linux.
How does the LionsOS Kitty example, which uses virtIO GPU via Linux instead of natively, work on macOS then? Am I missing something?
Only difference there is that it's via PCI instead of MMIO.
As was discussed offline, it is due to the udmabuf requirement for simulating on QEMU.
This PR adds an initial sDDF protocol design for 2D unaccelerated GPUs. It contains implementations for:
Note:
The GPU example in this PR does not work on macOS QEMU due to the udmabuf framework requirement, which only exists on Linux. (edit: the BLOB feature is now conditionally compiled and is off by default. You can turn on blob resources by specifying BLOB=1 in the Makefile args.)
udmabuf requires sudo to access. (edit: this is only true when BLOB=1 is specified in the Makefile.)
The Zig build for the example was broken. The build also makes use of imagemagick's convert tool to figure out the image resolution, which I never figured out how to get working with Zig's build tool. Even with the resolution hardcoded, running the example led to a VM fault, so there was still a bug in my build system somewhere. (edit: there was a stack overflow problem with Zig, fixed by increasing the stack size. The imagemagick convert tool issue is avoided by not allowing the user to supply their own image in the Zig build.)
The blob resource memory was initially page aligned in the client to avoid an alignment issue. (edit: the driver now does the aligning instead.)
Protocol Design
The sDDF GPU protocol design is quite similar to virtIO GPU. The protocol introduces the concept of 2D resources, each describing a 2D image that can be scanned out to a display. Clients can enqueue requests to create, destroy, and manipulate these 2D resources. Each 2D resource has its own private memory and, additionally, a separate memory backing that clients can modify directly. Clients can then request that the private memory be updated from the attached backing. Efficient driver implementations will often place this private memory in device memory. Clients can also request to attach and detach client memory to and from these resources. The client is expected to bookkeep and manage the memory that it attaches to resources.
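To make the private memory and backing split concrete, here is a minimal sketch of what updating private memory from the attached backing could look like for a rectangular region. The struct layout, the function name `transfer_to_2d`, the fixed 32-bit pixel size, and the assumption that backing and private memory share the same row stride are all hypothetical and not taken from this PR.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BYTES_PER_PIXEL 4u /* assumption: a 32-bit pixel format */

/* Hypothetical view of a 2D resource; not the sDDF definition. */
struct res2d {
    uint32_t width;          /* in pixels */
    uint32_t height;         /* in pixels */
    uint8_t *priv;           /* private memory, managed by the driver/device */
    const uint8_t *backing;  /* client memory attached as backing, or NULL */
};

struct rect {
    uint32_t x, y, width, height; /* in pixels */
};

/* Copy rectangle r from the backing into private memory.
 * Returns 0 on success, -1 if no backing is attached or r is out of bounds. */
static int transfer_to_2d(struct res2d *res, struct rect r)
{
    assert(res != NULL && res->priv != NULL);

    if (res->backing == NULL) {
        return -1;
    }
    /* Bounds checks written to avoid overflow in x + width and y + height. */
    if (r.x > res->width || r.width > res->width - r.x) {
        return -1;
    }
    if (r.y > res->height || r.height > res->height - r.y) {
        return -1;
    }

    size_t stride = (size_t)res->width * BYTES_PER_PIXEL;
    for (uint32_t row = 0; row < r.height; row++) {
        size_t off = (size_t)(r.y + row) * stride + (size_t)r.x * BYTES_PER_PIXEL;
        memcpy(res->priv + off, res->backing + off, (size_t)r.width * BYTES_PER_PIXEL);
    }
    return 0;
}
```

Only the rows and columns inside the rectangle are touched, so a client that modified a small part of its backing does not pay for a full-image copy.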
The protocol also provides blob resources that can be manipulated similarly. Unlike 2D resources, blob resources do not assume a pixel format and have to be cast to a framebuffer object (which then behaves much like a 2D resource) before they can be scanned out to a display. Blob resources also have no private memory; they rely only on client-allocated memory backing, which enables potentially zero-copy communication.
Queues
The queues consist of a request queue and a response queue, each implemented internally as a ring buffer.
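As an illustration of a ring-buffer-backed queue, the following is a minimal single-producer, single-consumer sketch. The entry layout, the capacity, and all names are hypothetical; the actual sDDF queue structures in this PR may differ, and a real shared-memory queue would also need memory barriers between writing a slot and publishing the index, which this sketch omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_CAPACITY 8u /* assumption: a small power-of-two capacity */

/* Free-running 32-bit indices only wrap correctly for power-of-two sizes. */
static_assert((QUEUE_CAPACITY & (QUEUE_CAPACITY - 1u)) == 0,
              "capacity must be a power of two");

/* Hypothetical request entry; not the sDDF layout. */
struct gpu_req {
    uint32_t code;
    uint32_t resource_id;
};

struct req_queue {
    uint32_t head; /* next slot to dequeue, advanced only by the consumer */
    uint32_t tail; /* next slot to enqueue, advanced only by the producer */
    struct gpu_req slots[QUEUE_CAPACITY];
};

static bool queue_empty(const struct req_queue *q)
{
    return q->head == q->tail;
}

static bool queue_full(const struct req_queue *q)
{
    return q->tail - q->head == QUEUE_CAPACITY;
}

/* Returns false without modifying the queue if it is full. */
static bool queue_enqueue(struct req_queue *q, struct gpu_req r)
{
    if (queue_full(q)) {
        return false;
    }
    q->slots[q->tail % QUEUE_CAPACITY] = r;
    q->tail++;
    return true;
}

/* Returns false without writing *out if the queue is empty. */
static bool queue_dequeue(struct req_queue *q, struct gpu_req *out)
{
    if (queue_empty(q)) {
        return false;
    }
    *out = q->slots[q->head % QUEUE_CAPACITY];
    q->head++;
    return true;
}
```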
Requests
Clients can make the following requests via the request queue:
Requesting SET_SCANOUT or SET_SCANOUT_BLOB with resource id=0 will disable the scanout. No resources can be created with id=0 as it is reserved for this purpose.
Blob resources support swapping in memory backing during runtime. This can be done by requesting RESOURCE_ATTACH_BACKING and RESOURCE_DETACH_BACKING on the resource.
Blob resources can be flushed to a scanout using RESOURCE_FLUSH, just like 2D resources.
Blob resources do not interact with TRANSFER_TO_2D and SET_SCANOUT; using either on a blob resource will result in an error.
2D resources do not interact with SET_SCANOUT_BLOB; using it on a 2D resource will result in an error.
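The rules above can be summarised as a small validation function. This is a sketch only: the enum and function names are hypothetical, only the requests named in the rules above are modelled, and treating id 0 as an error for non-scanout requests is an assumption that follows from id 0 never naming a real resource.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RESOURCE_ID_NONE 0u
static_assert(RESOURCE_ID_NONE == 0, "id 0 is reserved for disabling a scanout");

/* Hypothetical names; only a subset of the protocol's requests. */
enum req_code {
    REQ_SET_SCANOUT,
    REQ_SET_SCANOUT_BLOB,
    REQ_RESOURCE_FLUSH,
    REQ_TRANSFER_TO_2D,
};

enum res_kind { RES_2D, RES_BLOB };

enum verdict { VERDICT_OK, VERDICT_DISABLE_SCANOUT, VERDICT_ERROR };

/* kind is the type of the resource named by resource_id; it is ignored
 * when resource_id is RESOURCE_ID_NONE. */
static enum verdict check_request(enum req_code code, uint32_t resource_id,
                                  enum res_kind kind)
{
    bool is_scanout = code == REQ_SET_SCANOUT || code == REQ_SET_SCANOUT_BLOB;

    if (resource_id == RESOURCE_ID_NONE) {
        /* Assumption: id 0 is only meaningful for scanout requests. */
        return is_scanout ? VERDICT_DISABLE_SCANOUT : VERDICT_ERROR;
    }

    switch (code) {
    case REQ_SET_SCANOUT:
    case REQ_TRANSFER_TO_2D:
        return kind == RES_2D ? VERDICT_OK : VERDICT_ERROR;
    case REQ_SET_SCANOUT_BLOB:
        return kind == RES_BLOB ? VERDICT_OK : VERDICT_ERROR;
    case REQ_RESOURCE_FLUSH:
        return VERDICT_OK; /* valid for both 2D and blob resources */
    }
    return VERDICT_ERROR;
}
```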
Initialisation
As part of initialisation, clients MUST first make a GET_DISPLAY_INFO request. A client is not considered initialised until a GET_DISPLAY_INFO request has been responded to successfully. The client is then expected to use the scanout information from the GET_DISPLAY_INFO response for further operation.
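One way a client might track this rule is with a small piece of state that gates every request other than GET_DISPLAY_INFO. The sketch below is an assumption about how a client could structure this; the struct, the enum, and both function names are hypothetical, and the PR does not say where the rule is enforced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request classes for this sketch only. */
enum init_req { INIT_REQ_GET_DISPLAY_INFO, INIT_REQ_OTHER };

/* Hypothetical client-side state. */
struct client_state {
    bool initialised;      /* set after a successful GET_DISPLAY_INFO response */
    uint32_t num_scanouts; /* scanout information taken from that response */
};

/* Before initialisation, only GET_DISPLAY_INFO may be submitted. */
static bool may_submit(const struct client_state *c, enum init_req code)
{
    assert(c != NULL);
    return c->initialised || code == INIT_REQ_GET_DISPLAY_INFO;
}

/* A failed response leaves the client uninitialised. */
static void on_display_info_response(struct client_state *c, bool success,
                                     uint32_t num_scanouts)
{
    assert(c != NULL);
    if (!success) {
        return;
    }
    c->num_scanouts = num_scanouts;
    c->initialised = true;
}
```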
Events
GPU devices need to notify the client when changes happen in the hardware. These events are synchronised by atomic accesses in shared memory. There is currently only one event type, the display_info event.
Display info event
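A minimal sketch of how a display_info event could be raised and consumed through an atomic flag in shared memory, using C11 atomics. The struct layout and function names are hypothetical, and the choice of a single boolean flag is an assumption; the PR's actual event representation is not shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Shared memory between separate components only works with lock-free atomics. */
static_assert(ATOMIC_BOOL_LOCK_FREE == 2, "atomic_bool must always be lock-free");

/* Hypothetical layout of the shared event region. */
struct gpu_events {
    atomic_bool display_info; /* true while an event is pending */
};

/* Producer side: mark a display_info event as pending. Raising it twice
 * before the client looks still results in a single pending event. */
static void raise_display_info(struct gpu_events *ev)
{
    atomic_store_explicit(&ev->display_info, true, memory_order_release);
}

/* Client side: atomically read and clear the pending flag.
 * Returns true if an event was pending. */
static bool take_display_info(struct gpu_events *ev)
{
    return atomic_exchange_explicit(&ev->display_info, false, memory_order_acq_rel);
}
```

Using an exchange on the client side means an event raised between the check and the clear cannot be lost, which a separate load followed by a store would not guarantee.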
Blob resources and private resources
There are two types of resources a client can create. 2D resources assume a format consisting of a width, height, and pixel format that together describe a 2D image. Blob resources do not assume a format and only have memory with an associated size.
Request reordering
Example Operation: Creating a framebuffer and configuring a scanout
With 2D resources:
With blob resources:
Compatibility with 3D
GPU Virtualiser
The GPU virtualiser has two roles: it translates the offsets from clients into IO addresses, and it virtualises the scanout information from the device to clients. The current implementation does the simplest thing: it forwards the same view of the device scanouts to all clients. Under this implementation, a client's scanout id is identical to the device's true scanout id.
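The offset translation role can be sketched as a bounds-checked addition against the client's data region. The `client_region` struct and the function name are hypothetical, and the sketch assumes each client has a single physically contiguous region; the virtualiser in this PR may organise its regions differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one client's data region. */
struct client_region {
    uint64_t io_base; /* IO address of the start of the region */
    uint64_t size;    /* size of the region in bytes */
};

/* Translate a client offset and length into an IO address.
 * Returns false, leaving *io_addr untouched, if the range is empty
 * or does not lie fully inside the client's region. */
static bool offset_to_io_addr(const struct client_region *r, uint64_t offset,
                              uint64_t len, uint64_t *io_addr)
{
    assert(r != NULL && io_addr != NULL);

    /* Written so that offset + len cannot overflow. */
    if (len == 0 || offset >= r->size || len > r->size - offset) {
        return false;
    }
    *io_addr = r->io_base + offset;
    return true;
}
```

Checking the range here keeps a client from naming memory outside its own region, since the device only ever sees the translated address.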
Assumptions on request failure
Bookkeeping requests from multiple clients in the GPU virtualiser is complex, and the asynchronous nature of the request/response queues makes it even more complex when requests fail. I've made the assumption that, other than requests which create a resource, requests should never fail under normal circumstances. If they do, something is catastrophically wrong with the driver or the device, which would render recovery of bookkeeping state meaningless. This simplifies the virtualiser drastically by avoiding complex recovery logic upon request failure. The only exception is when requests are rendered stale by display info events, which, thankfully, does not require any complex recovery logic if you inspect the logic carefully.
GPU VirtIO Driver
Implementation of the virtIO GPU v1.2 specification with support for the VIRTIO_GPU_F_RESOURCE_BLOB feature.
Future Work