rustwasm / team

A point of coordination for all things Rust and WebAssembly

Usage of host provided allocator with libstd structures #253

Closed pepyakin closed 5 years ago

pepyakin commented 5 years ago

Let's say we are writing a program with a plugin system that we want to implement with wasm. Because we care about the size of a plugin, we decide to move the allocator out of the plugin and have the host define the malloc and free functions.

Let's say that a plugin can access some KV storage with arbitrarily sized values. So we write a host function that receives a key, fetches the value from the storage, allocates a memory buffer by calling malloc, fills it with the fetched data, and returns control back to the wasm. (Note that if the host didn't provide an allocator, you would have to deal with some sort of 2-phase (fetch-allocate) mechanism, which involves more overhead.)
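For comparison, the 2-phase alternative mentioned in the parenthesis could look roughly like the sketch below. The names `host_kv_len` and `host_kv_read` are hypothetical, and both are simulated in plain Rust (with slices instead of raw pointers) so the example runs natively; in a real plugin they would be host imports.

```rust
// Hypothetical two-phase host API, simulated in plain Rust so the sketch runs
// natively. In a real plugin both functions would be `extern "C"` imports.

/// Phase 1 (stand-in): reports the size of the value stored under `key`.
/// Here every "stored value" is simply the key repeated twice.
fn host_kv_len(key: &[u8]) -> usize {
    key.len() * 2
}

/// Phase 2 (stand-in): copies the value into `out`, a buffer that the
/// plugin allocated itself.
fn host_kv_read(key: &[u8], out: &mut [u8]) {
    for (i, byte) in out.iter_mut().enumerate() {
        *byte = key[i % key.len()];
    }
}

/// Plugin-side wrapper: two host calls per lookup, but the `Vec` comes from
/// the plugin's own allocator, so `Vec::from_raw_parts` is not involved.
pub fn fetch_kv_two_phase(key: &[u8]) -> Vec<u8> {
    let len = host_kv_len(key); // first host call: ask for the size
    let mut value = vec![0u8; len]; // allocate inside the plugin
    host_kv_read(key, &mut value); // second host call: fill the buffer
    value
}
```

The extra round trip (and possibly a second storage lookup on the host side) is the overhead that a host-provided allocator is meant to avoid.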

Then, let's consider how this API would be implemented on the wasm side.

extern "C" {
  /// Fetches the value stored under the given key.
  ///
  /// If the value is found, allocates a buffer of the appropriate size,
  /// fills it with the value, and writes the buffer's pointer to
  /// `value_ptr_ptr` and its length to `len_ptr`.
  /// Otherwise, traps, for simplicity.
  fn fetch_kv(
    key_ptr: *const u8,
    key_size: usize,
    value_ptr_ptr: *mut *mut u8,
    len_ptr: *mut usize
  );
}

Then imagine how you would implement a high-level wrapper for that, one that returns Vec<u8>.

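use std::ptr;

fn fetch_kv(key: &[u8]) -> Vec<u8> {
  // Must be a *mut pointer, since the host writes the buffer address into it.
  let mut value_ptr = ptr::null_mut();
  let mut value_size = 0;
  unsafe {
    ffi::fetch_kv(key.as_ptr(), key.len(), &mut value_ptr, &mut value_size);
    // The problematic step: the buffer was allocated by the host's malloc.
    Vec::from_raw_parts(value_ptr, value_size, value_size)
  }
}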

All well and good, except that this is not valid, because Vec::from_raw_parts has the following precondition:

  • ptr needs to have been previously allocated via String/Vec<T> (at least, it's highly likely to be incorrect if it wasn't).

which, I suppose, we can't guarantee, since the buffer is allocated by the host-provided malloc. However, in practice (assuming that we set our global allocator to one that uses the host-provided allocation routines), e.g. Vec::with_capacity will ultimately call into that same host-provided malloc.
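To make the "global allocator that uses host-provided routines" assumption concrete, here is a minimal sketch. `HostAllocator`, `host_malloc`, `host_free`, and `HOST_ALIGN` are hypothetical names; the two host functions are simulated on top of `std::alloc::System` so the example runs natively, whereas in a plugin they would be the imported malloc and free. It also assumes the host's malloc guarantees a fixed maximum alignment.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::ptr;

// Assumption: the host's `malloc` returns memory aligned to at least this much.
const HOST_ALIGN: usize = 16;

// ---- Stand-ins for the host imports (hypothetical). ----
// They keep the size in a header, as a C-style malloc/free pair would.
unsafe fn host_malloc(size: usize) -> *mut u8 {
    let layout = size
        .checked_add(HOST_ALIGN)
        .and_then(|total| Layout::from_size_align(total, HOST_ALIGN).ok());
    let layout = match layout {
        Some(layout) => layout,
        None => return ptr::null_mut(),
    };
    unsafe {
        let base = System.alloc(layout);
        if base.is_null() {
            return base;
        }
        (base as *mut usize).write(size); // remember the size for `host_free`
        base.add(HOST_ALIGN)
    }
}

unsafe fn host_free(data: *mut u8) {
    unsafe {
        let base = data.sub(HOST_ALIGN);
        let size = (base as *const usize).read();
        let layout = Layout::from_size_align_unchecked(HOST_ALIGN + size, HOST_ALIGN);
        System.dealloc(base, layout);
    }
}

/// Forwards every Rust allocation to the host's malloc/free.
pub struct HostAllocator;

unsafe impl GlobalAlloc for HostAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if layout.align() > HOST_ALIGN {
            // The host cannot honor this alignment; null reports failure.
            return ptr::null_mut();
        }
        unsafe { host_malloc(layout.size()) }
    }

    unsafe fn dealloc(&self, data: *mut u8, _layout: Layout) {
        unsafe { host_free(data) }
    }
}

// In the plugin crate this would be registered as:
// #[global_allocator]
// static ALLOC: HostAllocator = HostAllocator;
```

With such a registration, `Vec::with_capacity` and a buffer handed over by the host would both come from the same malloc, which is the situation the question is about.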

Yes, of course, there are other possible solutions:

so using Vec is really preferred.

So I wonder, what are the exact reasons for this invariant to be in place? Can we violate this invariant safely, given that the same platform (wasm32) and the same allocator (host-provided) will be used? Or is it still possible that an update of libstd could break the code?

alexcrichton commented 5 years ago

I believe the best way to handle this idiomatically today is to create a wrapper struct that has a custom destructor to free the contents and that Derefs to a slice, like Vec does. The standard library can't guarantee that it works when straddling two copies of itself, much less two entirely different platforms!
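A minimal sketch of such a wrapper is shown below. It is one possible reading of the suggestion, not code from this thread: `HostBuf`, `host_fetch_kv`, and `host_free` are hypothetical names, and the host side is simulated in plain Rust (returning the key reversed as the "value") so the example runs natively. In a real plugin the two host functions would be `extern "C"` imports.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ops::Deref;
use std::{mem, ptr, slice};

// ---- Stand-ins for the host (hypothetical), so the sketch runs natively. ----
const HEADER: usize = mem::size_of::<usize>();

fn host_layout(size: usize) -> Layout {
    Layout::from_size_align(HEADER + size, HEADER).expect("value too large")
}

/// Mimics the `fetch_kv` import: the "stored value" is the key reversed.
unsafe fn host_fetch_kv(
    key_ptr: *const u8,
    key_size: usize,
    value_ptr_ptr: *mut *mut u8,
    len_ptr: *mut usize,
) {
    unsafe {
        let key = slice::from_raw_parts(key_ptr, key_size);
        // Host-side "malloc": remember the size in a header before the data.
        let base = alloc(host_layout(key_size));
        assert!(!base.is_null(), "host allocation failed");
        (base as *mut usize).write(key_size);
        let data = base.add(HEADER);
        for (i, byte) in key.iter().rev().enumerate() {
            data.add(i).write(*byte);
        }
        *value_ptr_ptr = data;
        *len_ptr = key_size;
    }
}

/// Mimics the host's `free`: takes only the pointer, like C's `free`.
unsafe fn host_free(data: *mut u8) {
    unsafe {
        let base = data.sub(HEADER);
        let size = (base as *const usize).read();
        dealloc(base, host_layout(size));
    }
}

// ---- Plugin side ----

/// Owns a host-allocated buffer and releases it with the host's `free`.
pub struct HostBuf {
    ptr: *mut u8,
    len: usize,
}

impl Deref for HostBuf {
    type Target = [u8];

    fn deref(&self) -> &[u8] {
        if self.ptr.is_null() {
            return &[];
        }
        unsafe { slice::from_raw_parts(self.ptr, self.len) }
    }
}

impl Drop for HostBuf {
    fn drop(&mut self) {
        if !self.ptr.is_null() {
            unsafe { host_free(self.ptr) }
        }
    }
}

/// High-level wrapper: returns a `HostBuf` instead of a `Vec<u8>`.
pub fn fetch_kv(key: &[u8]) -> HostBuf {
    let mut value_ptr = ptr::null_mut();
    let mut value_len = 0usize;
    unsafe { host_fetch_kv(key.as_ptr(), key.len(), &mut value_ptr, &mut value_len) };
    HostBuf { ptr: value_ptr, len: value_len }
}
```

Because `HostBuf` derefs to `[u8]`, callers can use slice methods such as `len`, `iter`, or `to_vec` directly, and the buffer goes back to the host when the value is dropped, without relying on any `Vec::from_raw_parts` precondition.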

I don't think we can change this in upstream rust-lang/rust any time soon, but I don't think that means this has to become unergonomic. Otherwise it likely just falls into the bucket of "how FFI idiomatically works in Rust".

alexcrichton commented 5 years ago

I'm gonna go ahead and close this as I think it's answered, but let me know if it's in error!