rust-vmm / vhost

Apache License 2.0
126 stars 64 forks source link

ByteValued implementation for VhostUserInflight is UB #215

Open germag opened 7 months ago

germag commented 7 months ago

The VhostUserInflight struct implements ByteValued, but it has 4 bytes of padding at the end, so reading that uninitialized memory is UB, making the API unsound

/// Single memory region descriptor as payload for ADD_MEM_REG and REM_MEM_REG
/// requests.
#[repr(C)]
#[derive(Copy, Clone, Default)]
pub struct VhostUserInflight {
    /// Size of the area to track inflight I/O.
    pub mmap_size: u64,
    /// Offset of this area from the start of the supplied file descriptor.
    pub mmap_offset: u64,
    /// Number of virtqueues.
    pub num_queues: u16,
    /// Size of virtqueues.
    pub queue_size: u16,
}
...
unsafe impl ByteValued for VhostUserInflight {}

the problem is basically reading that uninitialized memory as &[u8], since type reading uinit memory is UB and a reference to uninit mem is also UB.

Possible solution

Although the right thing to do is to fix qemu and have this struct be packed, it will be difficult not to break backward compatibility.

Perhaps, we can fix this while keeping the backwards compatibility: Make VhostUserInflight #[repr(C, packed)], after reading VhostUserInflight, read MaybeUninit<u32>, we can add a new method Endpoint<R>::skip(len: usize). But, we need to skip it at the beginning of BackendReqHandler::handle_request()

pub fn handle_request(&mut self) -> Result<()> {
  let (size, buf) = match hdr.get_size() {
            0 => (0, vec![0u8; 0]),
            len => {
                let (size2, rbuf) = self.main_sock.recv_data(len as usize)?; // the UB happens inside recv_data()
                if size2 != len as usize {
                    return Err(Error::InvalidMessage);
                }
                (size2, rbuf)
            }
        };

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

typedef struct VhostUserInflight {
    uint64_t mmap_size;
    uint64_t mmap_offset;
    uint16_t num_queues;
    uint16_t queue_size;
    uint32_t padding; // MUST be zero initialized
} VhostUserInflight;

but this forces us to have two VhostUserInflight definitions, one for the backend and one for the frontend, because the frontend must continue sending the unpacked version with padding.

This issue is to continue the discussion started on https://github.com/rust-vmm/vhost/pull/208

Ablu commented 7 months ago

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour. When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid.

So IMHO we can just add the padding field and are fine...

stefano-garzarella commented 7 months ago

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour. When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid.

So IMHO we can just add the padding field and are fine...

Yep, I agree on that. In any case, I don't think we should ask QEMU to put it at 0, just because in Rust it might be UB. But at most we have to do that when we read the bytes from the socket, we can just overwrite them to 0.

germag commented 7 months ago

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour. When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid. So IMHO we can just add the padding field and are fine...

Yep, I agree on that. In any case, I don't think we should ask QEMU to put it at 0, just because in Rust it might be UB. But at most we have to do that when we read the bytes from the socket, we can just overwrite them to 0.

Why not?, it's not a performance-critical message (I guess), and it's not only Rust also C++ (although it has special rules for char), the C standard is more fuzzy on that aspect

germag commented 7 months ago

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour.

This is not correct it definitely is UB, adding the explicit padding probably will work in the current rustc implementation for this specific architecture, but is not part of the language for rust language type reading uninit memory is instant UB even if you cheat the compiler

When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid.

uninit (i.e., undef/poison) is not a valid value for any type, but volatile memory has nothing to do with uninit memory

So IMHO we can just add the padding field and are fine...

germag commented 7 months ago

Fix qemu (and frontend) adding an explicit padding field guaranteeing that it will be initialized to 0:

I still do not understand why we need this to be 0 initialized. At least from a Rust perspective we should be fine... We will read "any" padding, but it should not be undefined behaviour. When reading from volatile memory we only care that we write valid values to the type. If the padding gets fixed to an int type, any value is valid. So IMHO we can just add the padding field and are fine...

Yep, I agree on that. In any case, I don't think we should ask QEMU to put it at 0, just because in Rust it might be UB. But at most we have to do that when we read the bytes from the socket, we can just overwrite them to 0.

Why not?, it's not a performance-critical message (I guess), and it's not only Rust also C++ (although it has special rules for char), the C standard is more fuzzy on that aspect

Maybe we can just forget about qemu(*), and just add the explicit padding for this crate implementation of the frontend, but the backend will always skip the padding, so we need 2 definitions of the struct with/without the padding. I'm afraid the code will not be pretty :|

(*) I still think that qemu should be fix making that struct packed

Ablu commented 7 months ago

This is not correct it definitely is UB, adding the explicit padding probably will work in the current rustc implementation for this specific architecture, but is not part of the language for rust language type reading uninit memory is instant UB even if you cheat the compiler

I do not think this is true. I understand that there is undefined behaviour if a struct is not fully assigned. I also understand that we need to add the padding. But we do not need to mandate a value. Regardless of the padding being initialized or not when put into shared-memory, we DO initialize it in Rust. Maybe it was not initialized in C - thats a problem when QEMU access it of course - but we do not care. We read size_of::<Struct>() into the struct. Assuming the struct is all integers and the padding is fixed, I do not see how those bytes would cause undefined behaviour. The struct got completely initialized. The padding value may be anything of course (just like any other field) but it is not undefined behaviour.

Or stated differently: As long as whatever implements ByteValued is packed and only contains integer fields, the ByteValued abstraction should be safe.

So: Could you elaborate where you think the struct may not be fully assigned? :thinking:

stefano-garzarella commented 7 months ago

I think we need to split the issue in 2 small issues and address them separated:

  1. Fix the potential UB in this crate
  2. Extend the specification to use packed or an explicit padding

For 1, we should provide a fix that can work with any QEMU/VMM, also old ones. I don't understand why if we add just the padding here and we set it to 0 every time we read from the socket is still UB.

For 2, we need to update the spec and add a feature for that. I'm not sure if it is worth it, but I can't see how we can change that (adding packed, or adding a payload) without breaking compatibility.

germag commented 7 months ago

I think we need to split the issue in 2 small issues and address them separated:

1. Fix the potential UB in this crate

2. Extend the specification to use packed or an explicit padding

For 1, we should provide a fix that can work with any QEMU/VMM, also old ones. I don't understand why if we add just the padding here and we set it to 0 every time we read from the socket is still UB.

Maybe I didn't explain myself correctly, it's UB only if there is unit mem, whether the padding is implicit or explicit. If we read 4 bytes as MaybeUninit<u32> is not UB

For 2, we need to update the spec and add a feature for that. I'm not sure if it is worth it, but I can't see how we can change that (adding packed, or adding a payload) without breaking compatibility.

I agree I probably got ahead of myself when talking about qemu, sorry for the confusion.

stefano-garzarella commented 7 months ago

For 1, we should provide a fix that can work with any QEMU/VMM, also old ones. I don't understand why if we add just the padding here and we set it to 0 every time we read from the socket is still UB.

Maybe I didn't explain myself correctly, it's UB only if there is unit mem, whether the padding is implicit or explicit. If we read 4 bytes as MaybeUninit<u32> is not UB

The thing that's not clear to me is why only the padding has to be MaybeUninit<u32> and not all the fields at this point.

Also unclear to me is why if the remote application sending bytes puts the padding at 0 it's not UB, while if it puts random values in it is UB. Because in the end we're passing a buffer to the read() syscall and the kernel anyway initializes the padding to some values. I still don't understand why if those are 0 or they're something else, it changes something. What the kernel does or the remote application does is not under rustc control. But if it is the case, I think we can't trust the remote peer in order to have our code not UB. So IMHO we should don't trust the remote application (and compiler and kernel at all) and if there is something that could be UB, we need to handle in our read() wrapper.

Ablu commented 7 months ago

The thing that's not clear to me is why only the padding has to be MaybeUninit<u32> and not all the fields at this point.

Also unclear to me is why if the remote application sending bytes puts the padding at 0 it's not UB, while if it puts random values in it is UB.

This is the same concern that I have and what @germag and me have been haggling about :)

Because in the end we're passing a buffer to the read() syscall and the kernel anyway initializes the padding to some values.

This is true for what we read from sockets. But it should not really matter. The story should be the same for reading from shared memory. We may read arbitrary data, but we never leave anything uninitialized.

Ultimately, the bug that we have is that VhostUserInflight is not packed (which the safety contract of ByteValued mandates). Of course fixing that may have implications on backwards-compatibility. I think a quick-fix may be to mark it as packed and add a manual padding int. That is a major bump since currently all members are public, but it should preserve current behavior overall.

The best fix - as you suggested - is of course to fix the ambiguity in the spec.

Ablu commented 7 months ago

https://github.com/rust-vmm/vm-memory/issues/246 would help with these kind of issues...