Open dorindabassey opened 3 days ago
@stefano-garzarella is suggesting that we remove the gpu-socket feature flag, which I agree would solve the feature unification problem. More suggestions are welcome.
Just a bit more context. The issue was discovered while adding the vhost-device-gpu crate to a workspace with other crates that use the vhost-user-backend crate, see https://github.com/rust-vmm/vhost-device/pull/668
Since the new crate enables the gpu-socket feature, cargo also enables it for the other crates in the workspace to minimize the number of copies of the crate, see https://doc.rust-lang.org/cargo/reference/features.html#feature-unification
At this point, we hit the following error in another crate in the same workspace (vhost-device-console) that does not enable that feature:
Compiling vhost-device-console v0.1.0 (/workdir/staging/vhost-device-console)
error[E0046]: not all trait items implemented, missing: `set_gpu_socket`
--> vhost-device-console/src/vhu_console.rs:641:1
|
641 | impl VhostUserBackendMut for VhostUserConsoleBackend {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ missing `set_gpu_socket` in implementation
The first fix we had in mind was to somehow avoid feature unification, maybe by forcing cargo to use 2 copies, but at that point we found the recommendation in https://doc.rust-lang.org/cargo/reference/features.html#feature-unification that @dorindabassey also reported:
That is, enabling a feature should not disable functionality, and it should usually be safe to enable any combination of features. A feature should not introduce a SemVer-incompatible change.
So maybe we should avoid that here. I was thinking of not actually removing the feature, but just providing a default implementation in VhostUserBackend that should not be used if the feature is not enabled (maybe we can panic there):
diff --git a/vhost-user-backend/src/backend.rs b/vhost-user-backend/src/backend.rs
index ad9e950..2ba7576 100644
--- a/vhost-user-backend/src/backend.rs
+++ b/vhost-user-backend/src/backend.rs
@@ -87,13 +87,12 @@ pub trait VhostUserBackend: Send + Sync {
/// function.
fn set_backend_req_fd(&self, _backend: Backend) {}
- #[cfg(feature = "gpu-socket")]
/// Set handler for communicating with the frontend by the gpu specific backend communication
/// channel.
///
/// This method only exits when the crate feature gpu-socket is enabled, because this is only
/// useful for a gpu device.
- fn set_gpu_socket(&self, _gpu_backend: GpuBackend);
+ fn set_gpu_socket(&self, _gpu_backend: GpuBackend) {}
/// Get the map to map queue index to worker thread index.
///
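To make the effect of that diff concrete, here is a minimal sketch of a trait where set_gpu_socket has a default body. The types Backend, GpuBackend, ConsoleBackend and GpuDevice below are simplified, hypothetical stand-ins, not the real vhost / vhost-user-backend definitions, and the real trait has many more methods.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical stand-ins for the real vhost types.
pub struct Backend;
pub struct GpuBackend;

pub trait VhostUserBackend: Send + Sync {
    /// Optional hook with a no-op default, as in the diff context.
    fn set_backend_req_fd(&self, _backend: Backend) {}

    /// With a default body, implementors that never use the GPU channel
    /// no longer have to provide this method.
    fn set_gpu_socket(&self, _gpu_backend: GpuBackend) {}
}

/// A non-GPU device: implements the trait without mentioning the GPU hook.
pub struct ConsoleBackend;

impl VhostUserBackend for ConsoleBackend {}

/// A GPU device: overrides the hook and records that a socket arrived.
pub struct GpuDevice {
    pub has_socket: AtomicBool,
}

impl VhostUserBackend for GpuDevice {
    fn set_gpu_socket(&self, _gpu_backend: GpuBackend) {
        self.has_socket.store(true, Ordering::SeqCst);
    }
}
```

Under this assumption, a crate such as vhost-device-console would compile whether or not another workspace member turns the feature on, because the method is no longer a required trait item.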
@germag @epilys any suggestion?
A device-specific method in a general trait does not make much sense. I suggest adding something like a pub trait VhostUserGpuBackend: VhostUserBackend with this method and implementing it for the vhost-device-gpu crate.
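A rough sketch of what such a sub-trait could look like, ignoring for now the Mut variants and the daemon plumbing. Only the trait names VhostUserBackend and VhostUserGpuBackend come from this thread; the reduced method set, the device structs and the attach_gpu_socket helper are hypothetical.

```rust
use std::sync::Mutex;

// Hypothetical stand-in for the real vhost type.
pub struct GpuBackend;

/// Heavily reduced version of the general backend trait.
pub trait VhostUserBackend: Send + Sync {
    fn num_queues(&self) -> usize;
}

/// Sub-trait carrying only the GPU-specific part of the protocol.
pub trait VhostUserGpuBackend: VhostUserBackend {
    fn set_gpu_socket(&self, gpu_backend: GpuBackend);
}

/// A non-GPU device only implements the general trait.
pub struct ConsoleBackend;

impl VhostUserBackend for ConsoleBackend {
    fn num_queues(&self) -> usize {
        2
    }
}

/// A GPU device implements both traits.
pub struct GpuDevice {
    pub socket: Mutex<Option<GpuBackend>>,
}

impl VhostUserBackend for GpuDevice {
    fn num_queues(&self) -> usize {
        2
    }
}

impl VhostUserGpuBackend for GpuDevice {
    fn set_gpu_socket(&self, gpu_backend: GpuBackend) {
        *self.socket.lock().unwrap() = Some(gpu_backend);
    }
}

/// Generic code that needs the GPU channel asks for the sub-trait, so
/// passing a ConsoleBackend here is rejected at compile time.
pub fn attach_gpu_socket<B: VhostUserGpuBackend>(backend: &B, socket: GpuBackend) -> usize {
    backend.set_gpu_socket(socket);
    backend.num_queues()
}
```

The appeal of this shape is that non-GPU backends never see set_gpu_socket at all, so there is nothing for feature unification to break.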
@epilys it's not really device specific, IIRC it's part of the spec, but I may be wrong. In any case it's true that only the GPU device will use it, so a new trait makes sense.
@dorindabassey @mtjhrc do you remember why we did that?
@stefano-garzarella no no you're not wrong, it's part of the spec. But since it's for GPU stuff only, it doesn't make sense to have it on the general backend trait. A sub-trait for backends that use the vhost user GPU protocol sounds like a good solution to me.
yes, it's part of the spec in https://qemu-project.gitlab.io/qemu/interop/vhost-user-gpu.html
@dorindabassey @mtjhrc do you remember why we did that?
The gpu-socket feature was added in the vhost dependency to ensure that it remains specific to the GPU device.
I think we can make it an optional protocol feature just like set_backend_req_fd
Or use the type system with a sub-trait :) It's there, why not use it?
Yes, using a sub-trait seems nicer. Though I am not sure if it is worth the extra complexity/code.
There is the trait VhostUserBackend, but we also have VhostUserBackendMut and the impls:
impl<T: VhostUserBackendMut> VhostUserBackend for Mutex<T> {
impl<T: VhostUserBackendMut> VhostUserBackend for RwLock<T> {
So if we want a Gpu subtrait, we need to have both GpuVhostUserBackendMut and GpuVhostUserBackend. We also want to have impls for Mutex<T> and RwLock<T> to be consistent.
Then VhostUserDaemon and the internal code need to be extended to also work with the extended trait.
https://github.com/rust-vmm/vhost/blob/4f160320a86a27579a3a3373b590dde2c27959a6/vhost-user-backend/src/lib.rs#L98-L113
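To illustrate how the duplication adds up, here is a reduced sketch of the trait pairs and the lock-wrapper impls described above. The traits are cut down to one method each, GpuBackend and GpuDevice are hypothetical stand-ins, and the changes to VhostUserDaemon are not shown.

```rust
use std::sync::{Mutex, RwLock};

// Hypothetical stand-in for the real vhost type.
pub struct GpuBackend;

pub trait VhostUserBackend: Send + Sync {
    fn num_queues(&self) -> usize;
}

pub trait VhostUserBackendMut: Send + Sync {
    fn num_queues(&self) -> usize;
}

// Existing pattern: lock wrappers turn a Mut backend into a shared one.
impl<T: VhostUserBackendMut> VhostUserBackend for Mutex<T> {
    fn num_queues(&self) -> usize {
        self.lock().unwrap().num_queues()
    }
}

impl<T: VhostUserBackendMut> VhostUserBackend for RwLock<T> {
    fn num_queues(&self) -> usize {
        self.read().unwrap().num_queues()
    }
}

// A GPU sub-trait has to repeat the whole pattern: two traits...
pub trait GpuVhostUserBackend: VhostUserBackend {
    fn set_gpu_socket(&self, gpu_backend: GpuBackend);
}

pub trait GpuVhostUserBackendMut: VhostUserBackendMut {
    fn set_gpu_socket(&mut self, gpu_backend: GpuBackend);
}

// ...and two more lock-wrapper impls.
impl<T: GpuVhostUserBackendMut> GpuVhostUserBackend for Mutex<T> {
    fn set_gpu_socket(&self, gpu_backend: GpuBackend) {
        self.lock().unwrap().set_gpu_socket(gpu_backend)
    }
}

impl<T: GpuVhostUserBackendMut> GpuVhostUserBackend for RwLock<T> {
    fn set_gpu_socket(&self, gpu_backend: GpuBackend) {
        self.write().unwrap().set_gpu_socket(gpu_backend)
    }
}

/// Hypothetical GPU device written against the Mut variants.
pub struct GpuDevice {
    pub socket: Option<GpuBackend>,
}

impl VhostUserBackendMut for GpuDevice {
    fn num_queues(&self) -> usize {
        2
    }
}

impl GpuVhostUserBackendMut for GpuDevice {
    fn set_gpu_socket(&mut self, gpu_backend: GpuBackend) {
        self.socket = Some(gpu_backend);
    }
}
```

Even in this reduced form, one extra method costs two traits and two blanket impls before any daemon code is touched.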
The proper way to do this, I imagine, is to introduce yet another trait to be able to accept both GpuVhostUserBackend and VhostUserBackend. (The trait GpuVhostUserBackend wouldn't exist in the crate if gpu-socket is not enabled.)
While this is all doable, it just seems quite complex, and I am not sure if it is worth it.
Two simpler solutions are:
1) Just remove the feature flag and have the method available for all devices.
2) Keep the feature flag but have a default implementation of the method set_gpu_socket even if the feature is not enabled. It should probably just panic. Note that this means we also need to have the struct GpuBackend defined (probably as an empty struct) if we want to be able to define the method signature:
fn set_gpu_socket(&self, _gpu_backend: GpuBackend) {
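A minimal sketch of option 2, assuming the stub is selected with a cfg on the gpu-socket feature. The GpuBackend definitions here are placeholders (the real type lives in the vhost crate), and ConsoleBackend and GpuDevice are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Empty stub so the method signature still type-checks when the
/// gpu-socket feature is off.
#[cfg(not(feature = "gpu-socket"))]
pub struct GpuBackend;

/// Placeholder for the real type when the feature is on. In the real
/// crate this would be the type from vhost, not a local struct.
#[cfg(feature = "gpu-socket")]
pub struct GpuBackend;

pub trait VhostUserBackend: Send + Sync {
    /// Always present, so enabling the feature elsewhere in the workspace
    /// cannot turn this into a missing trait item. Backends that do not
    /// override it are not expected to ever receive this call.
    fn set_gpu_socket(&self, _gpu_backend: GpuBackend) {
        panic!("set_gpu_socket called on a backend that does not support it");
    }
}

/// Non-GPU device: relies on the panicking default.
pub struct ConsoleBackend;

impl VhostUserBackend for ConsoleBackend {}

/// GPU device: overrides the default.
pub struct GpuDevice {
    pub has_socket: AtomicBool,
}

impl VhostUserBackend for GpuDevice {
    fn set_gpu_socket(&self, _gpu_backend: GpuBackend) {
        self.has_socket.store(true, Ordering::SeqCst);
    }
}
```

The trade-off compared to a sub-trait is that a misuse is only caught at runtime, by the panic, instead of at compile time.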
Side note: having a separate Mut trait plus impls for Mutex<T> and RwLock<T> is simply bad abstraction and bad code. Bad abstractions are a code smell, and now the smell shows up. :(
That's not for the gpu device to solve, so a default impl with a panic! if the feature is disabled should be the way to go for now.
I agree, I don't like the current abstraction either. There are also other issues in vhost-user-backend, mainly the framework spawning worker threads instead of giving the library user more control (it can also complicate the device code in some circumstances). I think I'll open an issue to discuss that. This is kind of related to the separate Mut and non-Mut trait variants.
While this is all doable, it just seems quite complex, and I am not sure if it is worth it.
I actually tried this solution and yes, it is quite a complicated approach.
Just remove the existence of the feature flag and have it enabled for all devices.
Keep the feature flag but have default implementation of the method set_gpu_socket even if the feature is not enabled. It should probably just panic. Note that this means we also need to have the struct GpuBackend defined (as an empty struct probably) if we want to be able to define the method signature:
We can remove the feature flag and have a default implementation of the method set_gpu_socket, like set_backend_req_fd, so if other devices do not implement set_gpu_socket it won't be a problem.
Since the set_gpu_socket method is behind a feature flag, this causes feature unification issues, because vhost-device-gpu enables that feature and other crates do not. According to the Cargo features reference: