rust-vmm / vm-virtio

virtio implementation
Apache License 2.0

virtio-vsock: spec violation about the number of descriptors per packet #204

Closed · stefano-garzarella closed this 1 year ago

stefano-garzarella commented 1 year ago

In crates/devices/virtio-vsock/src/packet.rs, for both the TX and RX queues, we assume that each packet always consists of exactly 2 descriptors (returning a DescriptorChainTooShort error otherwise), but the virtio specification does not require this.

This was an implementation detail that held until Linux 6.1, but it's about to change. Testing vhost-user-vsock (which uses this crate) with the proposed Linux patch, I noticed that it doesn't work, printing this error:

WARN  vhost_user_vsock::vhu_vsock_thread] vsock: RX queue error: DescriptorChainTooShort

The vhost-vsock device in Linux does not make this assumption, so I think we should do the same here to comply with the virtio specification.
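
For context, the pattern at issue looks roughly like this (a simplified illustrative sketch with made-up types, not the actual packet.rs code):

```rust
// Simplified sketch of the problematic pattern (illustrative types,
// not the actual crates/devices/virtio-vsock code): the parser insists
// on one descriptor for the header plus a second one for the data.
struct Desc {
    len: u32,
}

#[derive(Debug)]
enum Error {
    DescriptorChainTooShort,
}

fn parse_packet(mut chain: impl Iterator<Item = Desc>) -> Result<(Desc, Desc), Error> {
    // The first descriptor is assumed to hold exactly the packet header...
    let header = chain.next().ok_or(Error::DescriptorChainTooShort)?;
    // ...and a second descriptor is assumed to hold the data. A guest
    // that places header and data in a single descriptor (which the
    // virtio spec allows, and which Linux does after the proposed
    // patch) fails here with DescriptorChainTooShort.
    let data = chain.next().ok_or(Error::DescriptorChainTooShort)?;
    Ok((header, data))
}
```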

stefano-garzarella commented 1 year ago

I would like to fix it before the new driver is released (which won't happen before Linux 6.2). @lauralt can you take a look at it? If not, I'll try to look at it next week.

lauralt commented 1 year ago

Yes, the implementation was Linux-specific at that point; it's mentioned in a doc comment as well. I agree this should be fixed, but other things were prioritised. Having another look at the code, it won't be super straightforward to fix, and it will probably be ugly, since we don't have the Reader and Writer abstractions that crosvm, for example, uses (there has been a PR open for this in vm-virtio for quite some time already: https://github.com/rust-vmm/vm-virtio/pull/33). I can take a better look early next week.

Out of curiosity, does it fail with the error you mentioned when there are multiple descriptors for the data, or in some other case?

stefano-garzarella commented 1 year ago

> Yes, the implementation was Linux-specific at that point; it's mentioned in a doc comment as well.

I didn't know that, thanks for pointing it out!

> I agree this should be fixed, but other things were prioritised. Having another look at the code, it won't be super straightforward to fix, and it will probably be ugly, since we don't have the Reader and Writer abstractions that crosvm, for example, uses (there has been a PR open for this in vm-virtio for quite some time already: #33). I can take a better look early next week.

Great! Thanks, let me know if I can help in any way.

> Out of curiosity, does it fail with the error you mentioned when there are multiple descriptors for the data, or in some other case?

Nope, in this case we have a single descriptor for both header and data (in RX).

lauralt commented 1 year ago

> Nope, in this case we have a single descriptor for both header and data (in RX).

Aaaah yes, that is possible, since for the vsock device the header and the data have the same type (device-readable or device-writable), so they can share a single descriptor; I forgot about that for a second. Thanks for the clarification.

lauralt commented 1 year ago

@stefano-garzarella I had a better look at the code; this should be doable with the current abstractions (I haven't done a PoC yet, though). Here's how I imagine the fix: instead of a single VolatileSlice, we use an array of VolatileSlices for each of the header and the data. Then, when we parse the RX and TX chains, we append each slice corresponding to a descriptor to either the header or the data, depending on the number of bytes consumed up to that point (less than the header size, or more). For the particular situation you mentioned, we would start with one VolatileSlice and use split_at to split it into two VolatileSlices, one for the header and one for the data. I don't have the bandwidth to take care of this in the near future, but I'm happy to help/review if you want to work on it.
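
For illustration, a minimal sketch of that splitting logic, with plain byte slices standing in for VolatileSlice (the helper name split_chain and the 44-byte header size are assumptions, not the vm-virtio API):

```rust
/// Hedged sketch of the idea above, with `&[u8]` standing in for
/// `VolatileSlice` (hypothetical helper, not the vm-virtio API).
/// The header size is assumed to be 44 bytes (struct virtio_vsock_hdr).
const HDR_SIZE: usize = 44;

/// Walk the buffers of a descriptor chain and route each one (or each
/// part of one) to the header or the data, based on bytes consumed.
fn split_chain<'a>(
    descriptors: &[&'a [u8]],
) -> Result<(Vec<&'a [u8]>, Vec<&'a [u8]>), &'static str> {
    let mut header = Vec::new();
    let mut data = Vec::new();
    let mut consumed = 0;

    for buf in descriptors {
        if consumed >= HDR_SIZE {
            // Header already complete: everything else is payload.
            data.push(*buf);
        } else if consumed + buf.len() <= HDR_SIZE {
            // This buffer lies entirely within the header.
            header.push(*buf);
        } else {
            // This buffer straddles the header/data boundary: split it,
            // mirroring what `VolatileSlice::split_at` would do.
            let (hdr_part, data_part) = buf.split_at(HDR_SIZE - consumed);
            header.push(hdr_part);
            data.push(data_part);
        }
        consumed += buf.len();
    }

    if consumed < HDR_SIZE {
        return Err("descriptor chain shorter than the packet header");
    }
    Ok((header, data))
}
```

For the single-descriptor RX case mentioned above, `split_chain(&[buf])` would yield the 44-byte header prefix and the remainder as data.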

This would probably also be a good time to invest in reviewing/updating https://github.com/rust-vmm/vm-virtio/pull/33, since the Reader and Writer abstractions should simplify this problem (which we ultimately have in every device implementation).
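
For reference, the kind of Reader abstraction meant here could look roughly like the following: a loose sketch in the spirit of crosvm's Reader, with copied buffers instead of guest memory; the names are hypothetical and this is not the PR #33 API:

```rust
use std::io::{self, Read};

/// Loose sketch of a `Reader`-style abstraction (hypothetical; not the
/// PR #33 API): expose a descriptor chain as one contiguous byte
/// stream, so packet parsing no longer cares how the guest split its
/// buffers across descriptors.
struct ChainReader {
    /// Contents of the device-readable descriptors, in chain order
    /// (copied here for simplicity; a real Reader would wrap guest memory).
    buffers: Vec<Vec<u8>>,
    index: usize,
    offset: usize,
}

impl Read for ChainReader {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        let mut written = 0;
        while written < out.len() && self.index < self.buffers.len() {
            let buf = &self.buffers[self.index];
            let n = (buf.len() - self.offset).min(out.len() - written);
            out[written..written + n].copy_from_slice(&buf[self.offset..self.offset + n]);
            written += n;
            self.offset += n;
            if self.offset == buf.len() {
                // Current descriptor exhausted; move to the next one.
                self.index += 1;
                self.offset = 0;
            }
        }
        Ok(written)
    }
}
```

With something like this, reading the packet header is just a `read_exact` into a fixed-size buffer, regardless of how the guest laid out its descriptors.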

stefano-garzarella commented 1 year ago

@lauralt thank you for the helpful information!

I'll see if I can get to it next week, but if we merge the patch into Linux, I think we should at least have a workaround right away and then refine the implementation later.

stefano-garzarella commented 1 year ago

@lauralt I pushed #207. For now I have not implemented the array of VolatileSlices, both because I have limited time (I will be off for the next 2 weeks) and because it requires an API change.

So for now I thought to just support the single-descriptor case as well, so that the device runs with Linux v6.2 without changing the API (we might release virtio-vsock 0.2.1).

Then we can think more carefully about your suggestion and change the API accordingly (a rough illustrative sketch of the interim idea follows below).

What do you think?
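
For illustration only, one possible shape for such a single-descriptor fallback; the names and the 44-byte header size are assumptions, and this is not necessarily what #207 implements:

```rust
/// Hedged sketch of a minimal fallback (hypothetical; not necessarily
/// what #207 implements). The header size is assumed to be 44 bytes.
const HDR_SIZE: u32 = 44;

struct Desc {
    len: u32,
}

enum Layout {
    /// Header and data share one descriptor (Linux v6.2 RX behavior).
    Single { data_len: u32 },
    /// Header and data in separate descriptors (pre-v6.2 behavior).
    Split { data_len: u32 },
}

/// Decide which of the two supported layouts a chain uses, without yet
/// handling arbitrary numbers of descriptors (that's the full fix).
fn detect_layout(first: &Desc, second: Option<&Desc>) -> Option<Layout> {
    if first.len > HDR_SIZE {
        // One descriptor large enough for header plus payload.
        Some(Layout::Single { data_len: first.len - HDR_SIZE })
    } else if first.len == HDR_SIZE {
        // Either the classic split layout, or a header-only packet.
        match second {
            Some(d) => Some(Layout::Split { data_len: d.len }),
            None => Some(Layout::Single { data_len: 0 }),
        }
    } else {
        // Chain too short to even hold the header.
        None
    }
}
```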

lauralt commented 1 year ago


@stefano-garzarella Sounds good. Thanks for taking care of this! I will have time to review the PR this Wednesday at the earliest. Does that work for you?

stefano-garzarella commented 1 year ago


@lauralt yep, sure!