jonhoo / flurry

A port of Java's ConcurrentHashMap to Rust
Apache License 2.0
525 stars 49 forks source link

Support rayon parallel iterators #14

Open jonhoo opened 4 years ago

jonhoo commented 4 years ago

We should implement rayon parallel iterators. The notes and code in iter/plumbing may be helpful, and the hashbrown implementation too.

cuviper commented 4 years ago

I'm happy to answer rayon questions on this.

jonhoo commented 4 years ago

@cuviper Since you're offering, one thing I was actually wondering was whether we can somehow take advantage of the fact that the map supports fully concurrent access. Since non-concurrent maps like hashbrown are also able to implement the parallel traits, it wasn't immediately clear to me whether how you get a win from flurry in rayon even though it really feels like it should be possible.

jonhoo commented 4 years ago

I suspect maybe the answer has to do with some of the traits in iter/plumbing, but not sure?

cuviper commented 4 years ago

Concurrent insert will make a naive ParallelExtend and FromParallelIterator trivial, just something like par_iter.for_each(|(key, value)| { map.insert(key, value); }). Maybe there's a more advanced approach that could improve performance, like folding into separate bins and then reducing into the final map, but I don't know your data structure enough to evaluate that.

Parallel iterators are a bit harder -- you need a strategy for splitting the map into separate "slices"/"views" of some sort. The hashbrown implementation should be a good reference if you look at how they use RawIterRange::split internally.

jonhoo commented 4 years ago

Ah, I see, hashbrown is forced for first reduce and then use one core to build the map, whereas we don't have to do that. Neat!

Stupremee commented 4 years ago

I would like to give this a try.

jonhoo commented 4 years ago

All yours!

Stupremee commented 4 years ago

I need some help. Whats the best way to split a Table into two halves?

jonhoo commented 4 years ago

The best way is probably to just split the list of bins in two: the "high" bins and the "low" bins.

Stupremee commented 4 years ago

I have no idea how to approach this. I think it's better if I let someone else do it.