crossbeam-rs / crossbeam

Tools for concurrent programming in Rust
Apache License 2.0
7.29k stars · 459 forks

Concurrent map interface #109

Open Vtec234 opened 7 years ago

Vtec234 commented 7 years ago

We need to figure out a safe and useful interface for concurrent maps that all implementations in crossbeam would then provide. This is my proposal of how it could look.

First, the basic functions mirroring the ones in the standard HashMap:

fn insert(&self, k: K, v: V) -> Option<V>;
fn contains_key<Q: ?Sized>(&self, k: &Q) -> bool where K: Borrow<Q>, Q: Hash + Eq;
fn remove<Q: ?Sized>(&self, k: &Q) -> Option<V> where K: Borrow<Q>, Q: Hash + Eq;
fn clear(&self);
fn len(&self) -> usize;
fn is_empty(&self) -> bool;

All methods obviously take an immutable &self.

Notice that there is no element access. The standard get returns a borrow Option<&V>, which we can never safely provide, since another thread may remove the entry in the meantime. Now, it is possible to provide a function like:

fn borrow<Q: ?Sized>(&self, k: &Q) -> Option<Guard<V>> where K: Borrow<Q>, Q: Hash + Eq;

where Guard<V> would provide immutable borrows of the value and maybe a possibility to mutate it (a somewhat similar idea to Entry). I can think of two ways to do this: The first would be to make Guard pin the epoch on creation and unpin it on destruction. However, since leaking destructors is safe, leaking a single Guard would disallow recollection of any memory from the pinning thread since the pin would stay on forever. (The same is true of mem::epoch::Guard, but we expect to only use it inside algorithms which are highly scrutinized due to the unsafe unlinked calls.) This is therefore most likely a terrible idea. The second way would be to use reference counting (probably Arc) internally. This is pointless, since instead we can provide the following interface:

fn find<Q: ?Sized>(&self, k: &Q) -> Option<V> where K: Borrow<Q>, Q: Hash + Eq, V: Clone;

I used a different function name to signify the difference from single-threaded maps. The user can then just set V to Arc<T> and have shared access to T. Hence, returning Guards only makes sense if it can be done without pinning the epoch or reference counting and in a faster way.

One thing that's neat and easily doable here (@DemiMarie's idea) is a closure that operates on the borrow:

fn with<Q: ?Sized, F>(&self, k: &Q, f: F)
  where K: Borrow<Q>,
        Q: Hash + Eq,
        F: FnOnce(Option<&V>);
...
map.with("key", |opt| println!("is key present? {}", opt.is_some()));

Internally this would pin the epoch, avoiding the problem I outlined above by not allowing arbitrary epoch guard leakage.

Then there are the various iterator APIs: Values, Keys, Iter, Drain. Providing them would give the concurrent map great expressive power. However, there is the small problem of everything being able to change concurrently, invalidating iterators. I think the smartest move here is to gather information on what would be actually useful and viable in this context instead of mirroring the standard library.

Finally, there is the important problem of atomic transactions. In a single-threaded map a user might do something like:

// Guarantees that 0 maps to "zero" iff it didn't already map to something else.
if !map.contains_key(&0) {
  map.insert(0, "zero");
}

For concurrent maps, doing this would require wrapping the structure in something like Mutex to guarantee that two threads don't both modify the value. This of course nullifies the whole gain from lock-free algorithms. Now, we might provide functions for often-used operations like the above one, maybe

/// Inserts a key-value pair iff the key wasn't observed in the structure.
/// Returns `true` if it succeeds, `false` otherwise.
fn insert_new(&self, key: K, val: V) -> bool;

In general though, it's infeasible to implement every single thing people might need. There is, however, a promising approach (unfortunately paywalled) to this problem that works for list-based implementations.

It would be great to then provide a functional Combinator API that would aggregate arbitrary operations like Iterators do and execute them in a single atomic transaction on destruction, maybe like so:

map.combine().contains_or(0, |access| { access.insert(0, "zero"); });

This would still be somewhat limited to what can be called on the Combinator, and complex boolean conditions might or might not be expressible, but it's a beginning. It's definitely possible to think of something much nicer, but in the end it will always be a bit awkward to use, since executing the transaction requires knowledge of each and every one of its invariants. Procedural macros might help here, but internally they will still expand to long function chains. I would be happy to be proven wrong here.

Lastly, it may be worth it to have a generic trait, something like ConcurrentMap<K, V>, expressing such an interface. Notice, however, that the atomic transaction implementation I linked above only applies to linked structures. Hash maps could use STM or the like, but I'm unsure whether it's a good idea to force them to provide the transaction API (depending on whether it impacts the performance of the whole structure, not just the relevant methods). We could also have similar interfaces for other structures such as queues, as @schets suggests in #56.

I would be happy to hear any feedback you might have on this.

ghost commented 7 years ago

I believe crossbeam should provide really performant, composable data structures with straightforward APIs. Transactions would be a nice feature, but they're a really tough problem and might be best to tackle in a separate crate.

As you demonstrated, borrowing content from shared data structures is tricky. I'm not fond of fn borrow giving out borrows to entries owned by the map. I'd much rather clearly separate ownership by providing fn find that returns clones of entries. This makes APIs simpler and doesn't enforce a particular (possibly suboptimal) memory management solution on users of crossbeam.

Finer grained memory management, transactional layer etc. should be handled by users or other crates. Concurrency is very hard itself even without these problems.

Also, it would be very interesting to see some particular real-world use cases for concurrent maps. This would make API modelling easier. Unfortunately, I'm out of ideas and have only very niche use cases in mind, like database indexes. Perhaps someone has other ideas? I'd also be very interested to hear about use cases for ordered concurrent maps.

Vtec234 commented 7 years ago

Transactions would be a nice feature, but they're a really tough problem and might be best to tackle in a separate crate.

The problem with providing transaction support in another crate is that it has to be integrated into the map implementation, i.e. the algorithms must take transactions into account, there are additional members to be added into internal list Nodes, etc. There is no way to write a crate on top of crossbeam that would add this feature to the maps (unless it used a lock of course, but that's silly). Doing that would therefore entail simply writing a new implementation of the map in the other crate. That's fine, but personally I'd be happier with less Rust ecosystem fragmentation.

What might be sensible to do is provide this in a second trait by having a ConcurrentMap<K, V> for the methods that all map implementations support and a ConcurrentTransactionalMap<K, V> including only the transaction API. We'd then have HashMap<K, V>: ConcurrentMap<K, V> and ListMap<K, V>: ConcurrentMap<K, V> + ConcurrentTransactionalMap<K, V>. This also alleviates the problem of forcing transaction support on map implementations that can't support it easily.

Having said this, it's still a distant feature. We need actual working structures before we can think about transactions, so it's not terribly important right now.

Also, it would be very interesting to see some particular real-world use cases for concurrent maps. This would make API modelling easier. Unfortunately, I'm out of ideas and have only very niche use cases in mind, like database indexes. Perhaps someone has other ideas? I'd also be very interested to hear about use cases for ordered concurrent maps.

I'd imagine the primary use to be distributed and highly parallel systems. Databases often run on such. Personally I wanted a map for a resource management system in a game engine. For this, the basic API is all I need. For databases, transaction support is important. I agree it's good to consider use case requirements.

ghost commented 7 years ago

There is no way to write a crate on top of crossbeam that would add this feature to the maps (unless it used a lock of course, but that's silly).

For databases, transaction support is important.

Actually, AFAIK, modern MVCC databases decouple index data structures (concurrent maps) from the transactional layer. Concurrent maps are therefore only concerned with really fast insert/update/delete, while transactional information is stored within the data itself rather than as metadata within the index. You don't need silly locks to implement transactions this way on top of crossbeam.

Another thing I'm worried about is that implementing transactions within crossbeam might force an uncomposable, crossbeam-specific transaction scheme on the user.

In any case, at least we agree this is something we shouldn't think about too much at the moment :)

Vtec234 commented 7 years ago

Sorry, I should've been more specific. What I meant is that there is no way to implement that particular transaction design (link again) on top of an existing structure. The concern is that what the authors refer to as "transactional boosting" is generally quite a bit slower than support baked into the structure, as their benchmarks show (although they did not benchmark MVCC).

You are right to be concerned about how the scheme is designed, but I hope it's possible to make it general enough to be usable in various contexts. I am by no means an expert on databases though, so there may be something I'm missing that makes this impossible.

schets commented 7 years ago

On the topic of iterators getting invalidated concurrently - the CHAMT data structure can efficiently support snapshots, allowing non-owning iterators to take a consistent view of the map. This doesn't really solve the transaction problem, but it's a step in the right direction, and I wouldn't be surprised if one could bake support for transactions into the data structure using a mechanism similar to the snapshot. Owning iterators like drain seem impossible, since you can't ever really move a value out of the map without some guarantee that nothing else is trying to read the data structure.

Here's a repo with the source; it's not well documented, but could help.

I like the idea of only exposing values in the map through a closure; it seems harder to accidentally leak a value or keep the epoch pinned forever that way.

Crossbeam also needs to gain the ability to run destructors on freed items for any sort of map to work generally.

AlisdairO commented 7 years ago

Even with closures I think it'll need very clear explanation to ensure people understand that they risk massive memory blowups if the closure fails to complete quickly. Seems like a bit of a footgun compared to cloning - I'd tend to favour at least providing cloning methods as well.

Crossbeam also needs to gain the ability to run destructors on freed items for any sort of map to work generally.

This seems to have been an issue for quite some time - does anyone know if there's been any progress on a fix? My (very unexciting but otherwise functional) concurrent map is stalled on this.

schets commented 7 years ago

I was working on the drop thing a while ago; I remember that there was something about requiring types with a static lifetime, but I haven't thought much about it recently. Aside from ensuring the lifetimes are correct, it's not difficult to add to crossbeam - only a few lines of code in the freeing function.

In general, with this sort of scheme, cloning, viewing, and returning guards all have the footgun aspect, since each runs arbitrary code. I think we should provide clone interfaces where they work, but otherwise provide a closure-based implementation so types don't have to be Clone.

ghost commented 7 years ago

I was working on the drop thing a while ago, I remember that there was something about requiring types with a static lifetime

This is why I strongly believe that garbage should not be tied to threads, but rather to every concurrent data structure. Then every data structure could free all the garbage it produced as soon as it is dropped.

AlisdairO commented 7 years ago

Fair point, I hadn't considered that clones might have long-running code in them too!

schets commented 7 years ago

The dissertation from the author of the transactional list paper seems quite relevant here.

Vtec234 commented 7 years ago

with closures I think it'll need very clear explanation to ensure people understand that they risk massive memory blowups if the closure fails to complete quickly.

clones might have long-running code in them too!

It's a good idea to explain the risks w.r.t. running time in the with(..) docs. I think it's rare for cloning to take long for typical key types. Using Vec for keys doesn't sound reasonable either way. Still, no harm in mentioning the risk here, too.

[with(closure)] Seems like a bit of a footgun compared to cloning - I'd tend to favour at least providing cloning methods as well.

Yes, we definitely want to provide both.

This is why I strongly believe that garbage should not be tied to threads, but rather to every concurrent data structure.

The problem with requiring 'static doesn't really have anything to do with threads; it would still exist in sequential execution. It's that the destructor runs after an arbitrary delay, so the data cannot refer to anything non-'static (local), since that would be unsound.

Having per-structure garbage bags might be profitable for another reason, though - when there are many structures, a long-running operation in one wouldn't prevent reclamation from another.

Then every data structure could free all the garbage it produced as soon as it is dropped.

I don't think that's terribly important, since concurrent structures are not meant to be created and dropped often.

I've got a working implementation of a map that leaks anything the keys own, but is correct for keys that have trivial destructors. I agree we need deferred destruction before adding any maps.

All the linked papers about snapshots and transactions are very interesting and will definitely be of great benefit to this library, but that will have to wait until we get the basics right.

Vtec234 commented 7 years ago

So @stjepang made me aware of some issues I hadn't considered. Firstly, my proposal only covers one type of map - the hash map. The other type we might want to support is the ordered map. For completeness, the basic interface for a hash map would look like:

fn insert(&self, k: K, v: V) -> Option<V>;
fn contains_key<Q: ?Sized>(&self, k: &Q) -> bool where K: Borrow<Q>, Q: Hash + Eq;
fn find<Q: ?Sized>(&self, k: &Q) -> Option<V> where K: Borrow<Q>, Q: Hash + Eq, V: Clone;
fn with<Q: ?Sized, F>(&self, k: &Q, f: F)
  where K: Borrow<Q>,
        Q: Hash + Eq,
        F: FnOnce(Option<&V>);
fn remove<Q: ?Sized>(&self, k: &Q) -> Option<V> where K: Borrow<Q>, Q: Hash + Eq;
fn clear(&self);
fn len(&self) -> usize;
fn is_empty(&self) -> bool;

Then, for ordered maps we'd have the same thing, but with PartialOrd instead of Hash. There are some subtle issues w.r.t. how a wrong ordering might give rise to unsoundness/use-after-free in concurrent lists by violating invariants, but I think they are resolvable without resorting to unsafe methods or traits.

Additionally, there are a lot of differences between particular map implementations. Some of them might put different requirements on key/value types, omit certain functionality or expose extra functionality, etc. This makes me wonder whether it might be better not to have a single trait trying to unify them all, and instead just expose somewhat similar, but appropriately differing, interfaces for each implementation. The answer to this should come out in practice.

AlisdairO commented 7 years ago

but is correct for keys that have trivial destructors.

Curious - what did you do to make things correct for values? My implementation currently leaks on nontrivial destructors for both, and also for resizing - I use fixed-size Vecs that get replaced.

Having per-structure garbage bags might be profitable for another reason, though - when there are many structures, a long-running operation in one wouldn't prevent reclamation from another.

Definitely agreed, if it's feasible. In the context of a large program, this would probably be highly beneficial for debuggability - has a kind of spooky-action-at-a-distance feel to it as things stand.

Vtec234 commented 7 years ago

Curious - what did you do to make things correct for values? My implementation currently leaks on nontrivial destructors for both, and also for resizing - I use fixed-size Vecs that get replaced.

What structure is your map based on? The implementability really depends on that. Mine is a skiplist, and those never need resizing or moving things around, so values either get moved out for the user to drop, or are destroyed immediately when the whole structure is dropped, since drops are single-threaded. Keys can't get moved out, though, so they leak owned resources.

On the other hand, for hash arrays there would be a whole lot of leakage due to resizing.

AlisdairO commented 7 years ago

Ah, I see - yes, mine's a hash array.