a non-allocating (in the steady state) parallel hasher

oconnor663 commented 6 years ago

an Arc<Condvar> on the state for blocking the caller
a threshold rule for waking a blocked caller only after enough items have been freed, maybe half
some sort of circular buffer with atomic read and write cursors?
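
A rough sketch of the first two ideas, to make the threshold rule concrete. Everything here is an assumption for illustration: the `Throttle` type, its method names, the "half of capacity" threshold, and the single blocked caller. The circular buffer with atomic cursors is left out; a plain counter behind a `Mutex` stands in for it.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Hypothetical slot counter shared between one caller and its workers.
pub struct Throttle {
    capacity: usize,
    wake_threshold: usize,
    free: Mutex<usize>,
    freed: Condvar,
}

impl Throttle {
    pub fn new(capacity: usize) -> Arc<Self> {
        assert!(capacity > 0);
        Arc::new(Throttle {
            capacity,
            // Assumed rule: wake the caller once half the slots are free.
            wake_threshold: std::cmp::max(1, capacity / 2),
            free: Mutex::new(capacity),
            freed: Condvar::new(),
        })
    }

    /// Take one slot. If none are free, block until at least
    /// `wake_threshold` slots have been released.
    pub fn acquire(&self) {
        let mut free = self.free.lock().unwrap();
        if *free == 0 {
            free = self
                .freed
                .wait_while(free, |f| *f < self.wake_threshold)
                .unwrap();
        }
        *free -= 1;
    }

    /// Return one slot, notifying the caller only past the threshold.
    pub fn release(&self) {
        let mut free = self.free.lock().unwrap();
        *free += 1;
        debug_assert!(*free <= self.capacity);
        if *free >= self.wake_threshold {
            self.freed.notify_one();
        }
    }

    pub fn free_slots(&self) -> usize {
        *self.free.lock().unwrap()
    }
}

/// Fill all four slots, let a worker free two, then take one more.
pub fn demo_throttle() -> usize {
    let throttle = Throttle::new(4);
    for _ in 0..4 {
        throttle.acquire();
    }
    let worker = {
        let t = Arc::clone(&throttle);
        thread::spawn(move || {
            t.release();
            t.release();
        })
    };
    throttle.acquire();
    worker.join().unwrap();
    throttle.free_slots()
}
```

The point of the threshold is that a blocked caller gets woken once per batch of freed slots rather than once per slot, which should cut down on wakeups when workers finish at about the same time.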

oconnor663 commented 6 years ago

Update: this isn't too bad with reusable jobs and capacity(1) channels for getting the result back. The last source of allocation is that e.g. rayon::spawn does a heap allocation every time you spawn a task. We would need some kind of dedicated thread pool to avoid that.
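
For reference, a minimal sketch of what "reusable jobs" could look like, using only the standard library. The `Job` struct, the `hash_chunks` function, the choice of two jobs, and the FNV-1a `toy_hash` are all made up for illustration and are not what Bao does. A single dedicated thread stands in for the dedicated pool, and the same two job buffers are recycled for every chunk, so the loop itself does not allocate once the buffers exist.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical reusable job: owns the buffer the caller copies input into.
struct Job {
    buf: Vec<u8>,
}

/// Stand-in hash (FNV-1a), only here so the example produces a result.
fn toy_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hash each `chunk_len`-sized chunk on a dedicated worker thread,
/// recycling two preallocated jobs. `chunk_len` must be nonzero.
pub fn hash_chunks(input: &[u8], chunk_len: usize) -> Vec<u64> {
    let (to_worker, from_caller) = sync_channel::<Job>(1);
    // Capacity-1 channel for getting the job and its result back.
    let (to_caller, from_worker) = sync_channel::<(Job, u64)>(1);

    // The worker runs one hardcoded loop, so nothing is boxed per task.
    let worker = thread::spawn(move || {
        for job in from_caller {
            let h = toy_hash(&job.buf);
            if to_caller.send((job, h)).is_err() {
                break;
            }
        }
    });

    let mut free: Vec<Job> = (0..2)
        .map(|_| Job { buf: Vec::with_capacity(chunk_len) })
        .collect();
    let mut hashes = Vec::with_capacity(input.len() / chunk_len + 1);
    let mut in_flight = 0;

    for chunk in input.chunks(chunk_len) {
        // Reuse a free job, or wait for the oldest one to come back.
        let mut job = match free.pop() {
            Some(job) => job,
            None => {
                let (job, h) = from_worker.recv().unwrap();
                in_flight -= 1;
                hashes.push(h);
                job
            }
        };
        job.buf.clear();
        job.buf.extend_from_slice(chunk);
        to_worker.send(job).unwrap();
        in_flight += 1;
    }

    // Drain whatever is still outstanding, in order.
    for _ in 0..in_flight {
        let (_job, h) = from_worker.recv().unwrap();
        hashes.push(h);
    }
    drop(to_worker);
    worker.join().unwrap();
    hashes
}
```

Results come back in submission order here only because there is a single worker; with several workers the caller would need to track which result belongs to which chunk.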

oconnor663 commented 6 years ago

Doing memory mapping + rayon::join is a lot simpler, and also faster. The benefits of having a parallel Write interface seem a little niche by comparison given how tricky it is to implement.
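
To show the shape of that approach without pulling in any crates, here is a sketch that swaps in `std::thread::scope` for `rayon::join` and a plain byte slice for the memory-mapped file. Those substitutions, the `CHUNK` size, the split-in-half rule, and the toy `leaf_hash` / `parent_hash` functions are all assumptions for illustration; none of this is Bao's actual tree layout or hash.

```rust
use std::thread;

/// Assumed leaf size; inputs at or below this are hashed directly.
const CHUNK: usize = 4096;

/// Stand-in leaf hash (FNV-1a), not a cryptographic hash.
fn leaf_hash(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Combine two child hashes into a parent hash.
fn parent_hash(left: u64, right: u64) -> u64 {
    let mut buf = [0u8; 16];
    buf[..8].copy_from_slice(&left.to_le_bytes());
    buf[8..].copy_from_slice(&right.to_le_bytes());
    leaf_hash(&buf)
}

/// Single-threaded reference version of the same recursion.
pub fn tree_hash_serial(input: &[u8]) -> u64 {
    if input.len() <= CHUNK {
        return leaf_hash(input);
    }
    let (left, right) = input.split_at(input.len() / 2);
    parent_hash(tree_hash_serial(left), tree_hash_serial(right))
}

/// Hash both halves at once. The input is only borrowed, never copied,
/// which is what makes a memory-mapped file a natural fit.
pub fn tree_hash_parallel(input: &[u8]) -> u64 {
    if input.len() <= CHUNK {
        return leaf_hash(input);
    }
    let (left, right) = input.split_at(input.len() / 2);
    let (lh, rh) = thread::scope(|s| {
        // With Rayon this pair would be a single rayon::join call.
        let left_handle = s.spawn(|| tree_hash_parallel(left));
        let rh = tree_hash_parallel(right);
        (left_handle.join().unwrap(), rh)
    });
    parent_hash(lh, rh)
}
```

Unlike `rayon::join`, this spawns a real OS thread at every split, so it is only reasonable for a small number of splits. The recursion is the part that carries over.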

oconnor663 commented 6 years ago

The main downside of this approach is complexity, but the second downside is that it requires dedicated threads. Rayon boxes all the closures that you spawn into it, since they're all different sizes. Avoiding that box means that the threads have to run hardcoded functions, which rules out worker threads that belong to any kind of shared thread pool like Rayon's.

Given that the whole goal of this design is to avoid overhead, but that any non-memory-mapping design necessarily pays a lot of copying overhead, there doesn't seem to be much value here. (Also, empirically, the additional allocation overhead seems to be pretty low when I've tried this.)

