danburkert / memmap-rs

cross-platform Rust API for memory mapped IO
Apache License 2.0
531 stars 157 forks source link

Expose a low-level API in lib.rs #20

Closed ghost closed 8 years ago

ghost commented 8 years ago

I'd love to use this library, but currently it's doing too much for my particular use case. In particular, I'd like to:

danburkert commented 8 years ago

mmap any offset in a file to a precise location in memory (I know what page boundaries are, I'm ok with handling errors). Basically, any extra checking is going to cost a lot in some of my use cases.

Are you asking for MAP_FIXED or are you concerned about the page size checks we do under the covers?

flush and unmap whenever I want (the order in which I do it are fundamental to my use case).

This is already possible. Mmap::flush corresponds to msync, and dropping the Mmap corresponds to munmap

ghost commented 8 years ago

Yep, MAP_FIXED + no page size checks would definitely be enough.

Then if the Mmap structure is not much larger than the size of a pointer, I'd be happy.

danburkert commented 8 years ago

I'm not against adding MAP_FIXED, but removing the page size checks is sufficiently against the spirit of this crate that I probably won't consider it (unless you can come up with a good way to make it only happen optionally). The reason the page size checks are there on construction and destruction are so that the offset does not need to be stored in the struct. How often are you creating new memory maps?

ghost commented 8 years ago

I definitely understand and respect the purpose of your crate, which is why I initially planned to call mmap directly from my crate. Actually, I think there is a way. You could split your current functions, and provide core unsafe functions, marking them unsafe, like:

pub unsafe fn unsafe_memmap(...) -> Memmap {
  some unsafe things
}

pub fn memmap(...)->Memmap {
  check stuff
  unsafe { unsafe_memmap(...) }
}
danburkert commented 8 years ago

The issue is that you are returning the same type, and so it must have the same destructor. There is no way to handle destruction safely at that point without either having the offset in the Mmap struct, or doing what it does currently. I am extremely skeptical that you will ever see the page_size stuff come up in a profile, especially with all the issues that the Linux mmap semaphore has. Happy to be proven wrong, though.

danburkert commented 8 years ago

Also if it does show up, you could add a fast path for when the offset is 0 in the constructor function. Still no way to remove the check from the destructor, though. Again, none of this matters in the grand scheme of how expensive it is to create a mmap.

ghost commented 8 years ago

Right! Which, I realize now, is why I was also asking to call munmap myself! I store the offset in a struct in my crate. What I'm writing is a B-tree in an mmaped file, which means that I need to handle offsets and lengths myself, with a relatively low level of detail.

I'd be happy to have even a really unsafe function, returning a raw pointer that I need to munmap myself. My main constraint is portability.

danburkert commented 8 years ago

Hmm OK I'm not seeing the issue. I've quickly skimmed the code in http://pijul.org/sanakirja/src/transaction.rs, but I can't see why you wouldn't be able to use Mmap as it's written now. Instead of handling the offset yourself when accessing the mapped memory, let the Mmap type handle it. You can unmap a Mmap by dropping it, so you have full control over that.

ghost commented 8 years ago

Ok, I can try. The thing I want is not yet implemented in Sanakirja, I wanted to map stuff in a way that might not be too portable. I see how I could use you stuff with just the support for MAP_FIXED.

reem commented 8 years ago

Not sure if I should add to this issue or make another, let me know if you'd rather I move this to another one.

I also need some low-level control for a library I am working on; in particular, I need:

Like @pijul I would also like the option to do all page alignment checks myself.

I think the best way to accomplish these goals is to expose another RawMmap type with a near wholly unsafe API that can be used to build abstractions that require extremely low level control and which performs as few additional operations over mmap/munmap/msync/etc. as possible.

danburkert commented 8 years ago

I see how I could use you stuff with just the support for MAP_FIXED

,

ability to map using MAP_FIXED

Yah, I think MAP_FIXED should be added, but I may not be able to do it immediately (PRs very welcome).

ability to munmap only parts of a previous mmap

I don't think this is possible on Windows, from reading the UnmapViewOfFile docs. Not sure what your usecase is, but perhaps it would be enough to madvise away the pages you don't need? I don't know if madvise bindings exist anywhere (would be a nice addition to this crate), but the cross-platform concerns are less because it's just a hint.

I think the best way to accomplish these goals is to expose another RawMmap type with a near wholly unsafe API that can be used to build abstractions that require extremely low level control and which performs as few additional operations over mmap/munmap/msync/etc. as possible.

The goal of this crate has been to expose the lowest common denominator interface that works across platform. The support for non-page aligned offsets is pretty much the only 'feature' provided that doesn't map 1 to 1 with a system call, and that's because it is essentially zero overhead to provide it.

reem commented 8 years ago

Ah, you're right about munmapping only a region of an mmap not being portable to windows - I guess I will need to write my own platform specific bindings since I really need this feature.

ghost commented 8 years ago

Alright, I've found a workaround and changed my API for something simpler. I'd just need one low-level thing, though: an unsafe function to flush a range with a non-mutable borrow on the mmap.

I know I should use views for what I want to do, but the API is a bit too complex for my use case (i'm basically writing a concurrent allocator of pages in an mmaped file).

danburkert commented 8 years ago

@pijul that's an interesting usecase that tends to come up a lot. It was the reason I added the unsafe clone to the view type (so I could unsafely clone the view, and then flush from another thread). Perhaps flush shouldn't require a mutable borrow of the mmap at all? I don't think that would be a problem WRT Rust's safety guarantees.

ghost commented 8 years ago

You're right, it's actually not unsafe, since it doesn't change anything to the program's memory. Even if the mmap is mutable borrowed in the same lifetime, it doesn't change the contents of the borrows.

danburkert commented 8 years ago

Just published version 0.3 with flush methods taking a const reference. I'm created #21 to track the MAP_FIXED issue, but again I probably will not be implementing it myself (as I have no need). I'm going to close this issue out, but if I missed any concerns, feel free to reopen or file new issues.