rust-vmm / mshv

Crates for Microsoft Hypervisor ioctls and bindings
Apache License 2.0
29 stars 12 forks source link

Add necessary APIs to interact with synthetic state components #136

Closed russell-islam closed 5 months ago

russell-islam commented 5 months ago

Summary of the PR

This PR adds necessary structs, APIs, unit tests to save and restore synthetic state components

Requirements

Before submitting your PR, please make sure you addressed the following requirements:

NunoDasNeves commented 5 months ago

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state. I'm not sure how much code change it would require on the CLH side...

Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

russell-islam commented 5 months ago

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state. I'm not sure how much code change it would require on the CLH side...

Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

I see your point. That would unify the API but for readability and API differentiation I would prefer the current approach. I was thinking the same as yours then I looked into hvlite code. hvlite also has different api and struct.

For now we have three states that are similar (xsave, simp, and sief). For synthetic timers we do want keep it separate.

@liuw @jinankjain any thoughts. I am open to anything that we can agree.

jinankjain commented 5 months ago

In general, I had one more question with this serialization/deserialization logic? Is there a way to versionize these structs, if some of these fields in these structs changes in future? Is there a way to downgrade and upgrade between these versions?

russell-islam commented 5 months ago

In general, I had one more question with this serialization/deserialization logic? Is there a way to versionize these structs, if some of these fields in these structs changes in future? Is there a way to downgrade and upgrade between these versions?

No way at the moment to versionalize these structs. Even KVM does not have anything.

liuw commented 5 months ago

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state. I'm not sure how much code change it would require on the CLH side... Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

I see your point. That would unify the API but for readability and API differentiation I would prefer the current approach. I was thinking the same as yours then I looked into hvlite code. hvlite also has different api and struct.

For now we have three states that are similar (xsave, simp, and sief). For synthetic timers we do want keep it separate.

@liuw @jinankjain any thoughts. I am open to anything that we can agree.

I don't have a clear answer to this. The downside of the fixed size buffers is it unnecessarily increases the amount of data transferred a lot, since most of the states will be far smaller than 4K. Imagine in the future you want to migrate a VM with hundreds VCPUs. That adds up quickly.

russell-islam commented 5 months ago

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state. I'm not sure how much code change it would require on the CLH side... Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

I see your point. That would unify the API but for readability and API differentiation I would prefer the current approach. I was thinking the same as yours then I looked into hvlite code. hvlite also has different api and struct. For now we have three states that are similar (xsave, simp, and sief). For synthetic timers we do want keep it separate. @liuw @jinankjain any thoughts. I am open to anything that we can agree.

I don't have a clear answer to this. The downside of the fixed size buffers is it unnecessarily increases the amount of data transferred a lot, since most of the states will be far smaller than 4K. Imagine in the future you want to migrate a VM with hundreds VCPUs. That adds up quickly.

There will be no unnecessary data. Nuno's suggestion is to unify three state components, all will be 4k buffer. API will have an extra parameter that indicates the type of the state(xsave, simp or sief). API will accept/return a 4k buffer always. @NunoDasNeves Please correct me if I am wrong.

liuw commented 5 months ago

I wonder if we can avoid having so much bespoke code for all these different kinds of state. The main use for this API is save/restore. We could just loop over the different kinds of state, storing the result in opaque fixed-size buffers (e.g. of 4096 bytes, which is the max state size we know about). And then do the same in reverse to restore the state. I'm not sure how much code change it would require on the CLH side... Being able to set/get LAPIC state by itself seems useful, because CLH actually uses it for something other than save/restore state.

I see your point. That would unify the API but for readability and API differentiation I would prefer the current approach. I was thinking the same as yours then I looked into hvlite code. hvlite also has different api and struct. For now we have three states that are similar (xsave, simp, and sief). For synthetic timers we do want keep it separate. @liuw @jinankjain any thoughts. I am open to anything that we can agree.

I don't have a clear answer to this. The downside of the fixed size buffers is it unnecessarily increases the amount of data transferred a lot, since most of the states will be far smaller than 4K. Imagine in the future you want to migrate a VM with hundreds VCPUs. That adds up quickly.

There will be no unnecessary data. Nuno's suggestion is to unify three state components, all will be 4k buffer. API will have an extra parameter that indicates the type of the state(xsave, simp or sief). API will accept/return a 4k buffer always. @NunoDasNeves Please correct me if I am wrong.

Okay, I misread. Sorry for the noise.

I don't have a strong opinion then.

NunoDasNeves commented 5 months ago

To sum it up - the IOCTL API needs a page-aligned, page-multiple-sized buffer, but the VP state data doesn't have to be stored that way. It could even be concatenated into one big buffer on the heap, so no space is wasted.

Here's some 'pseudorust':

struct AllTheVpState {
    offsets: [u64; MSHV_VP_STATE_COUNT], // each value is an index into buffer. The sizes can be worked out from this.
    buffer: Vec<u8>,
};

fn get_all_the_vp_state() -> Result<AllTheVpState> {
    mut buffer = Buffer::new(HV_PAGE_SIZE, HV_PAGE_SIZE); // assume this is big enough for any VP state
    mut ret = AllTheVpState{
        ..Default::default()
    };
    for i in 0..MSHV_VP_STATE_COUNT {
        mut vp_state = mshv_get_set_vp_state{
            buf_ptr: buffer.buf as u64,
            buf_sz: buffer.size() as u32,
            type_: i as u8,
            ..Default::default()
        };
        self.get_vp_state_ioctl(&mut vp_state)?;
        let actual_size = vp_state.buf_sz;
        ret.offsets[i] = ret.buffer.len();
        // idk if this is a real function...
        ret.buffer.append_from_pointer(buffer.buf, actual_size);
    }
    Ok(ret)
}

fn set_all_the_vp_state(state: &AllTheVpState) -> Result<()> {
    mut buffer = Buffer::new(HV_PAGE_SIZE, HV_PAGE_SIZE); // assume this is big enough for any VP state
    for i in 0..MSHV_VP_STATE_COUNT {
        let pointer_to_data = state.buffer.as_ptr().offset(ret.offsets[i]);
        let data_size = if (i < MSHV_VP_STATE_COUNT - 1) ret.offsets[i + 1] else ret.buffer.len() - ret.offsets[i];
        memcpy(pointer_to_data, buffer.buf, data_size);
        mut vp_state = mshv_get_set_vp_state{
            buf_ptr: buffer.buf as u64,
            buf_sz: buffer.size() as u32,
            type_: i as u8,
            ..Default::default()
        };
        self.set_vp_state_ioctl(&mut vp_state)?;
        let actual_size = vp_state.buf_sz;
        assert(actual_size == data_size);
    }
    Ok(ret)
}
NunoDasNeves commented 5 months ago

Here's another option. I really think storing it like this will simplify the whole thing and be less error prone than this huge amount of manual (de)serialization:

struct AllTheVpState {
    xsave: [u8; 4096],
    lapic: [u8; 1024],
    simp: [u8, 4096],
    siefp: [u8, 4096],
    stimers: hv_synthetic_timers_state, // or another byte array of the correct size if you don't want to expose this to CLH
};

You can still retrieve the LAPIC state separately with the existing code, of course.

russell-islam commented 5 months ago

To sum it up - the IOCTL API needs a page-aligned, page-multiple-sized buffer, but the VP state data doesn't have to be stored that way. It could even be concatenated into one big buffer on the heap, so no space is wasted.

Here's some 'pseudorust':

struct AllTheVpState {
    offsets: [u64; MSHV_VP_STATE_COUNT], // each value is an index into buffer. The sizes can be worked out from this.
    buffer: Vec<u8>,
};

fn get_all_the_vp_state() -> Result<AllTheVpState> {
    mut buffer = Buffer::new(HV_PAGE_SIZE, HV_PAGE_SIZE); // assume this is big enough for any VP state
    mut ret = AllTheVpState{
        ..Default::default()
    };
    for i in 0..MSHV_VP_STATE_COUNT {
        mut vp_state = mshv_get_set_vp_state{
            buf_ptr: buffer.buf as u64,
            buf_sz: buffer.size() as u32,
            type_: i as u8,
            ..Default::default()
        };
        self.get_vp_state_ioctl(&mut vp_state)?;
        let actual_size = vp_state.buf_sz;
        ret.offsets[i] = ret.buffer.len();
        // idk if this is a real function...
        ret.buffer.append_from_pointer(buffer.buf, actual_size);
    }
    Ok(ret)
}

fn set_all_the_vp_state(state: &AllTheVpState) -> Result<()> {
    mut buffer = Buffer::new(HV_PAGE_SIZE, HV_PAGE_SIZE); // assume this is big enough for any VP state
    for i in 0..MSHV_VP_STATE_COUNT {
        let pointer_to_data = state.buffer.as_ptr().offset(ret.offsets[i]);
        let data_size = if (i < MSHV_VP_STATE_COUNT - 1) ret.offsets[i + 1] else ret.buffer.len() - ret.offsets[i];
        memcpy(pointer_to_data, buffer.buf, data_size);
        mut vp_state = mshv_get_set_vp_state{
            buf_ptr: buffer.buf as u64,
            buf_sz: buffer.size() as u32,
            type_: i as u8,
            ..Default::default()
        };
        self.set_vp_state_ioctl(&mut vp_state)?;
        let actual_size = vp_state.buf_sz;
        assert(actual_size == data_size);
    }
    Ok(ret)
}

We want to keep LAPIC and SynicTimers excluded form this. Right? Merging all the states to one buffer seems overkill and make the reading complex while these states and APIs are too simple. I am not sure what benefits give us if we use just use one big buffer.

russell-islam commented 5 months ago

Here's another option. I really think storing it like this will simplify the whole thing and be less error prone than this huge amount of manual (de)serialization:

struct AllTheVpState {
    xsave: [u8; 4096],
    lapic: [u8; 1024],
    simp: [u8, 4096],
    siefp: [u8, 4096],
    stimers: hv_synthetic_timers_state, // or another byte array of the correct size if you don't want to expose this to CLH
};

You can still retrieve the LAPIC state separately with the existing code, of course.

I am not sure though how can I serialize and deserialize it?

russell-islam commented 5 months ago

Here's another option. I really think storing it like this will simplify the whole thing and be less error prone than this huge amount of manual (de)serialization:

struct AllTheVpState {
    xsave: [u8; 4096],
    lapic: [u8; 1024],
    simp: [u8, 4096],
    siefp: [u8, 4096],
    stimers: hv_synthetic_timers_state, // or another byte array of the correct size if you don't want to expose this to CLH
};

You can still retrieve the LAPIC state separately with the existing code, of course.

We are minimizing the API Call to mshv-ioctls not the hypercall, right? If we use this structure we will endup copying the buffer to each this array, and in a single API we will do 5 IOCTLS, doesn't it look messy?

NunoDasNeves commented 5 months ago

We are minimizing the API Call to mshv-ioctls not the hypercall, right? If we use this structure we will endup copying the buffer to each this array, and in a single API we will do 5 IOCTLS, doesn't it look messy?

It's not about minimizing the API calls, but the amount of code. For what we need, I think having a separate get_*() function and data structures for each type of state is not necessary.

We could also change the IOCTL interface if you like.

russell-islam commented 5 months ago

We are minimizing the API Call to mshv-ioctls not the hypercall, right? If we use this structure we will endup copying the buffer to each this array, and in a single API we will do 5 IOCTLS, doesn't it look messy?

It's not about minimizing the API calls, but the amount of code. For what we need, I think having a separate get_*() function and data structures for each type of state is not necessary.

We could also change the IOCTL interface if you like.

@NunoDasNeves After thinking a bit more I have decided to use one large buffer. It is more cleaner that way. Making this PR as draft and I will let you know once ready to review.

russell-islam commented 5 months ago

Thanks @NunoDasNeves for your through review. I have addressed your comments.

praveen-pk commented 5 months ago

Apart from above comments, I don't have much to add. LGTM!