rust-lang / wg-allocators

Home of the Allocators working group: Paving a path for a standard set of allocator traits to be used in collections!
http://bit.ly/hello-wg-allocators

Defining 'keep out' regions of memory #26

Open ckaran opened 5 years ago

ckaran commented 5 years ago

I've been reading the notes on mmap on Linux (quoted below):

Using MAP_FIXED safely

The only safe use for MAP_FIXED is where the address range specified by addr and length was previously reserved using another mapping; otherwise, the use of MAP_FIXED is hazardous because it forcibly removes preexisting mappings, making it easy for a multithreaded process to corrupt its own address space.

For example, suppose that thread A looks through /proc/<pid>/maps in order to locate an unused address range that it can map using MAP_FIXED, while thread B simultaneously acquires part or all of that same address range. When thread A subsequently employs mmap(MAP_FIXED), it will effectively clobber the mapping that thread B created. In this scenario, thread B need not create a mapping directly; simply making a library call that, internally, uses dlopen(3) to load some other shared library, will suffice. The dlopen(3) call will map the library into the process's address space. Furthermore, almost any library call may be implemented in a way that adds memory mappings to the address space, either with this technique, or by simply allocating memory. Examples include brk(2), malloc(3), pthread_create(3), and the PAM libraries ⟨http://www.linux-pam.org⟩.

Since Linux 4.17, a multithreaded program can use the MAP_FIXED_NOREPLACE flag to avoid the hazard described above when attempting to create a mapping at a fixed address that has not been reserved by a preexisting mapping.

Which got me thinking: how do I tell an allocator, in a portable way, that I don't want it to use some address range? I mean, they outline the solution for Linux, but since Rust is targeting a wide range of systems, it appears that there needs to be a way of telling whatever is handling memory to either leave some range alone, or tell it that it can only touch a given range.

Come to think of it, there may need to be some portable and extensible (i.e., future-proof) way of giving address ranges attributes, so that the allocator can be smarter about what it does, in case there are additional attributes that we'd want to give a range of memory.
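To make the idea of attributed address ranges a bit more concrete, here is a minimal sketch of what such a registry might look like. Everything in it (`RangeAttr`, `RangeRegistry`, `tag`, `permits`) is a hypothetical name invented for illustration; none of it exists in `std` or in any allocator crate. It only models the bookkeeping, and assumes an allocator would consult `permits` before handing out a block.

```rust
use std::ops::Range;

/// Hypothetical attributes that a caller might attach to an address range.
/// `#[non_exhaustive]` leaves room for attributes nobody has thought of yet.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[non_exhaustive]
pub enum RangeAttr {
    /// The allocator must never hand out memory overlapping this range.
    KeepOut,
    /// If any such range exists, the allocator may only use memory inside one.
    AllowedOnly,
}

/// Hypothetical registry of tagged address ranges (half-open: start..end).
#[derive(Debug, Default)]
pub struct RangeRegistry {
    entries: Vec<(Range<usize>, RangeAttr)>,
}

impl RangeRegistry {
    pub fn new() -> Self {
        Self::default()
    }

    /// Attach `attr` to `range`.
    pub fn tag(&mut self, range: Range<usize>, attr: RangeAttr) {
        self.entries.push((range, attr));
    }

    /// Would a block at `addr..addr + len` respect every tagged range?
    pub fn permits(&self, addr: usize, len: usize) -> bool {
        // A block whose end overflows the address space is never acceptable.
        let end = match addr.checked_add(len) {
            Some(end) => end,
            None => return false,
        };
        let overlaps = |r: &Range<usize>| addr < r.end && r.start < end;
        let contains = |r: &Range<usize>| r.start <= addr && end <= r.end;

        // Any overlap with a keep-out range rejects the block.
        if self
            .entries
            .iter()
            .any(|(r, a)| *a == RangeAttr::KeepOut && overlaps(r))
        {
            return false;
        }

        // If "allowed only" ranges exist, the block must sit fully inside one.
        let mut allowed = self
            .entries
            .iter()
            .filter(|(_, a)| *a == RangeAttr::AllowedOnly)
            .peekable();
        if allowed.peek().is_none() {
            return true;
        }
        allowed.any(|(r, _)| contains(r))
    }
}
```

With a keep-out tag on `0x2000..0x3000`, a block at `0x2800` of length `0x1000` would be rejected, while one starting exactly at `0x3000` would be accepted because the ranges are half-open.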

gnzlbg commented 5 years ago

AFAICT the solution outlined there only acquires new memory from the Linux kernel using mmap, which essentially turns the code that does that into an allocator, which is independent from all other ones.

For example, you can't call C realloc with a pointer to that memory, because the C global allocator doesn't know anything about it.

I mean, they outline the solution for Linux, but since Rust is targeting a wide range of systems, it appears that there needs to be a way of telling whatever is handling memory to either leave some range alone, or tell it that it can only touch a given range.

Which memory allocators support this? For example, is there an API for glibc's allocator, jemalloc, etc. to do this?

Also, which operating systems support doing this? If only Linux/Android do, then at best any solution will be "portable" to those systems only. Or how do you imagine such a solution being implemented on an OS that does not support that? I suppose an allocator could check, for every allocation returned by the OS, whether it overlaps with some address range, and if so free it and try allocating again until a suitable allocation is returned. But chances are that if you make the exact same request after freeing that memory, you will get the exact same region of memory back, entering an infinite loop.
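The retry hazard described above can be sketched with a toy model. `ToyOs` below is a made-up stand-in for an operating system that always returns the lowest free page, which is the assumption that makes the naive loop spin; real kernels are under no obligation to behave this way. `naive` is the free-and-retry loop (bounded by `max_tries` so that the failure is observable rather than an actual hang). `hold_and_retry` is one possible workaround that is not proposed in this thread: keep the rejected pages until a suitable one turns up, then release them.

```rust
use std::ops::Range;

/// Toy stand-in for an OS page source: always returns the lowest free page.
pub struct ToyOs {
    in_use: Vec<bool>,
    pub page: usize,
}

impl ToyOs {
    pub fn new(pages: usize, page: usize) -> Self {
        ToyOs { in_use: vec![false; pages], page }
    }

    /// Hand out the lowest free page, or `None` if all pages are taken.
    pub fn acquire(&mut self) -> Option<usize> {
        let i = self.in_use.iter().position(|used| !*used)?;
        self.in_use[i] = true;
        Some(i * self.page)
    }

    /// Return a page previously obtained from `acquire`.
    pub fn release(&mut self, addr: usize) {
        self.in_use[addr / self.page] = false;
    }
}

/// Does `addr..addr + len` overlap the half-open `keep_out` range?
pub fn overlaps(addr: usize, len: usize, keep_out: &Range<usize>) -> bool {
    addr < keep_out.end && keep_out.start < addr + len
}

/// Naive approach: free a rejected page immediately and ask again.
/// With a lowest-free-first source this gets the same page back every time.
pub fn naive(os: &mut ToyOs, keep_out: &Range<usize>, max_tries: usize) -> Option<usize> {
    for _ in 0..max_tries {
        let addr = os.acquire()?;
        if !overlaps(addr, os.page, keep_out) {
            return Some(addr);
        }
        os.release(addr); // the next acquire() returns this very page again
    }
    None
}

/// Workaround sketch: hold on to rejected pages so the source is forced to
/// move past them, and only release them once the search is over.
pub fn hold_and_retry(os: &mut ToyOs, keep_out: &Range<usize>) -> Option<usize> {
    let mut rejected = Vec::new();
    let mut found = None;
    while let Some(addr) = os.acquire() {
        if !overlaps(addr, os.page, keep_out) {
            found = Some(addr);
            break;
        }
        rejected.push(addr);
    }
    for addr in rejected {
        os.release(addr);
    }
    found
}
```

In this model, with the first two pages marked keep-out, `naive` gives up after exhausting its tries, whereas `hold_and_retry` returns the third page. The workaround temporarily occupies the keep-out region itself, so on a real system it would only be acceptable if briefly mapping that region is harmless.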

ckaran commented 5 years ago

@gnzlbg wrote:

Which memory allocators support this? For example, is there an API for glibc's allocator, jemalloc, etc. to do this?

I did a very quick scan of glibc's allocator API and jemalloc's API. It appears that glibc does have hooks to tune how malloc operates. Start reading at https://www.gnu.org/software/libc/manual/html_mono/libc.html#Memory; there is quite a bit to go through. It appears that glibc uses a combination of sbrk and mmap, falling back to the latter when the former doesn't work for one reason or another. It also has the mallopt function, which allows you to tune how malloc works. However, I didn't see a 'keep-out' type of flag.

jemalloc's API has the extent_hooks API, which, when reading through it, implies that it is possible to request memory starting at certain absolute memory addresses. I don't see a way of informing jemalloc that a given range shouldn't be used, or is special in some way.

Also, which operating systems support doing this?

I don't know, I haven't researched this enough.

Or how do you imagine such a solution being implemented on an OS that does not support that?

I expect that if you try to set any unsupported attributes, then there will be a runtime error raised (a Result of some kind). I don't think that it would be possible to detect this kind of problem at compile time, but I may be wrong.
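As a rough sketch of what I mean by a runtime error: the types below (`RangeAttr`, `AttrError`, `ToyAllocator`, `set_attr`) are all hypothetical names made up for this example, not an existing or proposed API. The toy allocator is constructed with the list of attributes it claims to support, and rejects anything else with an `Err` instead of silently ignoring it.

```rust
use std::fmt;
use std::ops::Range;

/// Hypothetical attributes a caller might want to attach to an address range.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RangeAttr {
    KeepOut,
    AllowedOnly,
}

/// Hypothetical error returned when an attribute cannot be honoured.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AttrError {
    Unsupported(RangeAttr),
}

impl fmt::Display for AttrError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AttrError::Unsupported(attr) => {
                write!(f, "attribute {:?} is not supported by this allocator", attr)
            }
        }
    }
}

impl std::error::Error for AttrError {}

/// Toy allocator front end that only records range attributes.
pub struct ToyAllocator {
    supported: &'static [RangeAttr],
    tagged: Vec<(Range<usize>, RangeAttr)>,
}

impl ToyAllocator {
    /// `supported` lists the attributes this (imaginary) backend understands.
    pub fn new(supported: &'static [RangeAttr]) -> Self {
        ToyAllocator { supported, tagged: Vec::new() }
    }

    /// Record `attr` for `range`, or report at runtime that it is unsupported.
    pub fn set_attr(&mut self, range: Range<usize>, attr: RangeAttr) -> Result<(), AttrError> {
        if !self.supported.contains(&attr) {
            return Err(AttrError::Unsupported(attr));
        }
        self.tagged.push((range, attr));
        Ok(())
    }

    /// Ranges that have been tagged successfully so far.
    pub fn tagged(&self) -> &[(Range<usize>, RangeAttr)] {
        &self.tagged
    }
}
```

An allocator built with `ToyAllocator::new(&[RangeAttr::KeepOut])` would accept a `KeepOut` request and return `Err(AttrError::Unsupported(RangeAttr::AllowedOnly))` for an `AllowedOnly` one, leaving the caller to decide whether to fall back or abort.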