It would be nice to provide safe Rust APIs that use normal Rust types like Vec<_, std::alloc::System>, Box<_, std::alloc::System> with C libraries that expect to free pointers allocated by their clients, or inversely, to malloc pointers that their clients later free.
Currently this can't be done: the only sound way to manage these pointers from Rust is to call libc::malloc and libc::free directly. I can identify at least three reasons:
First, it is explicitly not allowed according to the documentation of std::alloc::System, which says: "it is not valid to mix use of the backing system allocator with System, as this implementation may include extra work, such as to serve alignment requests greater than the alignment provided directly by the backing system allocator."
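To make the "extra work" concrete, here is a minimal sketch on stable Rust that asks System for an alignment far larger than malloc typically guarantees. The function name system_alloc_is_aligned is hypothetical. My understanding (an assumption about implementation details, not a documented guarantee) is that on some targets System satisfies such requests by over-allocating and returning an adjusted pointer, which is why handing that pointer to the C library's free would not be valid.

```rust
use std::alloc::{handle_alloc_error, GlobalAlloc, Layout, System};

/// Hypothetical helper: allocates `size` bytes from `System` with the
/// requested alignment, checks that the returned pointer honors it,
/// and frees the block again through `System` (never through C `free`).
pub fn system_alloc_is_aligned(size: usize, align: usize) -> bool {
    let layout = Layout::from_size_align(size, align).expect("invalid layout");
    // `GlobalAlloc::alloc` must not be called with a zero-sized layout.
    assert!(layout.size() > 0, "zero-sized layouts are not supported here");
    unsafe {
        let ptr = System.alloc(layout);
        if ptr.is_null() {
            handle_alloc_error(layout);
        }
        let aligned = (ptr as usize) % align == 0;
        // The same layout must be passed back on deallocation.
        System.dealloc(ptr, layout);
        aligned
    }
}
```

Calling system_alloc_is_aligned(64, 4096) should report an aligned pointer even though plain malloc makes no such promise; how System achieves that is exactly the part that is not specified.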
Second, even if it were allowed, it would require the C API to inform its clients of the capacity and length of allocated pointers. For example, consider these code snippets:
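// C code
#include <stdlib.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

typedef struct {
    size_t len;
    unsigned char *buf;
} sized_buf_t;

// clone a zero-terminated string into a length+buffer string
sized_buf_t copy_cstr(const char *cstr) {
    size_t cap = 0;
    size_t len = 0;
    unsigned char *buf = NULL;
    for (const char *p = cstr; *p; ++p) {
        if (len == cap) {
            // grow geometrically; realloc error handling omitted for brevity
            cap = MAX(1, 2 * cap);
            buf = realloc(buf, cap);
        }
        buf[len++] = *(const unsigned char *)p;
    }
    // the capacity is dropped here: callers only ever see len and buf
    return (sized_buf_t){len, buf};
}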
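// Rust code (nightly only: Vec<_, System> needs the unstable allocator_api feature)
#![feature(allocator_api)]
use std::alloc::System;
use std::ffi::CStr;

// `c::copy_cstr` is assumed to be the FFI binding to the C function above
fn copy_cstr(cstr: &CStr) -> Vec<u8, System> {
    unsafe {
        let sb = c::copy_cstr(cstr.as_ptr());
        Vec::from_raw_parts_in(sb.buf, sb.len, todo!("we don't know the capacity..."), System)
    }
}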
This code can't be made to work, because we are not allowed to construct a Vec without knowing the capacity of the underlying allocation. Even though the underlying free doesn't care, the documentation of Vec::from_raw_parts_in prohibits this. See e.g. https://github.com/rust-lang/wg-allocators/issues/99 for further discussion.
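In the meantime, one workaround is to stay out of Vec entirely: wrap the pointer in an owning type that exposes the bytes as a slice and releases them with free, so no capacity is ever needed. A minimal sketch, assuming the C library hands over ownership of a malloc-family pointer. MallocBuf and malloc_copy are hypothetical names, and malloc/free are declared by hand only to keep the example dependency-free (the libc crate exposes the same functions).

```rust
use std::ffi::c_void;
use std::ops::Deref;
use std::{ptr, slice};

// Hand-written declarations of the C allocator functions (assumption:
// the program links against the platform C runtime, as std usually does).
unsafe extern "C" {
    fn malloc(size: usize) -> *mut c_void;
    fn free(ptr: *mut c_void);
}

/// Hypothetical owning wrapper around a `malloc`-allocated byte buffer.
pub struct MallocBuf {
    ptr: *mut u8,
    len: usize,
}

impl MallocBuf {
    /// # Safety
    /// `ptr` must be null, or point to at least `len` initialized bytes
    /// obtained from `malloc`/`realloc`, with ownership passed to the result.
    pub unsafe fn from_raw_parts(ptr: *mut u8, len: usize) -> Self {
        MallocBuf { ptr, len }
    }
}

impl Deref for MallocBuf {
    type Target = [u8];
    fn deref(&self) -> &[u8] {
        if self.ptr.is_null() {
            // A null buffer (e.g. from an empty input) is an empty slice.
            &[]
        } else {
            unsafe { slice::from_raw_parts(self.ptr, self.len) }
        }
    }
}

impl Drop for MallocBuf {
    fn drop(&mut self) {
        // `free(NULL)` is a no-op, so the null case needs no special handling.
        unsafe { free(self.ptr.cast::<c_void>()) }
    }
}

/// Stand-in for a C function that returns a `malloc`-allocated copy.
pub fn malloc_copy(bytes: &[u8]) -> MallocBuf {
    if bytes.is_empty() {
        return MallocBuf { ptr: ptr::null_mut(), len: 0 };
    }
    unsafe {
        let p = malloc(bytes.len()).cast::<u8>();
        assert!(!p.is_null(), "malloc failed");
        ptr::copy_nonoverlapping(bytes.as_ptr(), p, bytes.len());
        MallocBuf::from_raw_parts(p, bytes.len())
    }
}
```

Callers that need a Vec<u8> in the global allocator can copy with buf.to_vec(). That costs an extra allocation and copy, which is precisely the overhead that passing the pointer straight into a Vec would avoid.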
Third, even if the above two issues were solved, the ergonomics are still bad because (AFAIK) there's no automatic way to convert from Vec<_, System> to Vec<_>, even if the global allocator is indeed System. It would be nice if we could write code like this:
let v: Vec<_> = copy_cstr(c"foo");
and have it compile in programs where the global allocator is System, and fail with a type error otherwise. However, even in such programs, Vec<T, System> and Vec<T, Global> are still different types.