oconnor663 / bao

an implementation of BLAKE3 verified streaming

a non-allocating (in the steady state) parallel hasher #14

Closed by oconnor663 6 years ago

oconnor663 commented 6 years ago

Update: this isn't too bad with reusable jobs and capacity(1) channels for getting the result back. The last source of allocation is that rayon::spawn, for example, does a heap allocation every time you spawn a task. We would need some kind of dedicated thread pool to avoid that.
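A rough sketch of the reusable-jobs idea, using only the standard library. Everything here is hypothetical and for illustration: the `Job` and `Worker` types, the `hash_chunks` helper, the 1024-byte buffer capacity, and the toy FNV-1a hash standing in for a real chunk hash. It assumes `std::sync::mpsc::sync_channel(1)` sets up its one-slot buffer when the channel is created, so sending a job should not need a fresh allocation per message.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread;

/// A reusable job (hypothetical type): the buffer travels to the worker and
/// back, so its allocation is recycled instead of repeated.
struct Job {
    buf: Vec<u8>,
    result: u64,
}

/// Toy stand-in for a real hash function (64-bit FNV-1a).
fn toy_hash(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// A dedicated thread with one capacity-1 channel in each direction.
struct Worker {
    to_worker: SyncSender<Job>,
    from_worker: Receiver<Job>,
}

impl Worker {
    fn new() -> Worker {
        let (to_worker, jobs) = sync_channel::<Job>(1);
        let (results, from_worker) = sync_channel::<Job>(1);
        // The thread runs one fixed loop, so no closure is boxed per job.
        // It exits when the sending side of `jobs` is dropped.
        thread::spawn(move || {
            for mut job in jobs {
                job.result = toy_hash(&job.buf);
                if results.send(job).is_err() {
                    break;
                }
            }
        });
        Worker { to_worker, from_worker }
    }
}

/// Hashes each chunk on a small pool of dedicated workers, reusing one job
/// per worker. Buffers only reallocate if a chunk exceeds their capacity.
fn hash_chunks(chunks: &[&[u8]], num_workers: usize) -> Vec<u64> {
    assert!(num_workers > 0);
    let workers: Vec<Worker> = (0..num_workers).map(|_| Worker::new()).collect();
    let mut idle: Vec<Job> = (0..num_workers)
        .map(|_| Job { buf: Vec::with_capacity(1024), result: 0 })
        .collect();
    let mut out = Vec::with_capacity(chunks.len());
    for batch in chunks.chunks(num_workers) {
        // Hand one recycled job to each worker...
        for (worker, chunk) in workers.iter().zip(batch) {
            let mut job = idle.pop().expect("one job per worker");
            job.buf.clear();
            job.buf.extend_from_slice(chunk);
            worker.to_worker.send(job).expect("worker exited");
        }
        // ...then collect the results in the same order and recycle the jobs.
        for worker in workers.iter().take(batch.len()) {
            let job = worker.from_worker.recv().expect("worker exited");
            out.push(job.result);
            idle.push(job);
        }
    }
    out
}
```

Calling `hash_chunks(&chunks, 2)` returns one hash per input chunk, in order. After the initial setup, the only per-chunk work on the calling thread is a copy into a recycled buffer, which is the copying cost that a memory-mapped design avoids.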

oconnor663 commented 6 years ago

Doing memory mapping + rayon::join is a lot simpler, and also faster. The benefits of having a parallel Write interface seem a little niche by comparison given how tricky it is to implement.
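For comparison, the recursive shape of the simpler approach might look something like the sketch below. It is not Bao's code. To stay dependency-free it uses `std::thread::scope` as a stand-in for `rayon::join` (which would run the two halves on Rayon's pool rather than starting a new OS thread for each split), and it takes a plain `&[u8]`, which is what a memory map from a crate like memmap2 would deref to. `LEAF_LEN`, `toy_leaf`, `toy_parent`, and the midpoint split are all made up for illustration and do not match Bao's tree layout.

```rust
use std::thread;

// Hypothetical leaf size; not Bao's actual chunk size.
const LEAF_LEN: usize = 4096;

/// Toy leaf hash (64-bit FNV-1a), standing in for a real chunk hash.
fn toy_leaf(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Toy parent node: mixes two child hashes. Not a real compression function.
fn toy_parent(left: u64, right: u64) -> u64 {
    let mut bytes = [0u8; 16];
    bytes[..8].copy_from_slice(&left.to_le_bytes());
    bytes[8..].copy_from_slice(&right.to_le_bytes());
    toy_leaf(&bytes)
}

/// Recursively hashes `input` in place, with no copying and no job queue.
/// With `parallel` set, the left half runs on a scoped thread while the
/// current thread handles the right half.
fn hash_tree(input: &[u8], parallel: bool) -> u64 {
    if input.len() <= LEAF_LEN {
        return toy_leaf(input);
    }
    let (left, right) = input.split_at(input.len() / 2);
    let (l, r) = if parallel {
        // With Rayon this whole branch would be a single `rayon::join` call.
        thread::scope(|s| {
            let left_thread = s.spawn(|| hash_tree(left, true));
            let r = hash_tree(right, true);
            (left_thread.join().expect("left half panicked"), r)
        })
    } else {
        (hash_tree(left, false), hash_tree(right, false))
    };
    toy_parent(l, r)
}
```

Both halves borrow directly from the input slice, so nothing is copied into per-job buffers. The parallel and serial paths visit the same tree, so they return the same value.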

oconnor663 commented 6 years ago

The main downside of this approach is complexity, but the second downside is that it requires dedicated threads. Rayon boxes all the closures that you spawn into it, since they're all different sizes. Avoiding that box means that the threads have to run hardcoded functions, which rules out worker threads that belong to any kind of shared thread pool like Rayon's.
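To illustrate the boxing point in plain Rust (none of this is Rayon's internals): closures that capture different data are different types with different sizes, so a single queue of arbitrary tasks ends up holding something like `Box<dyn FnOnce()>`. A dedicated worker can instead accept one fixed-size job type and run one fixed function. `BoxedTask`, `FixedJob`, and the helper functions below are hypothetical names for this sketch.

```rust
use std::mem::size_of_val;

// A type-erased task, roughly what a general-purpose pool has to queue.
type BoxedTask = Box<dyn FnOnce() -> u64 + Send>;

/// Two closures with different captures have different sizes...
fn closure_sizes() -> (usize, usize) {
    let small = 7u8;
    let big = [1u8; 64];
    let a = move || small as u64;
    let b = move || big.iter().map(|&x| x as u64).sum::<u64>();
    (size_of_val(&a), size_of_val(&b))
}

/// ...so storing them in one queue means boxing each one separately.
fn run_boxed() -> u64 {
    let small = 7u8;
    let big = [1u8; 64];
    let mut queue: Vec<BoxedTask> = Vec::new();
    queue.push(Box::new(move || small as u64));
    queue.push(Box::new(move || big.iter().map(|&x| x as u64).sum::<u64>()));
    queue.into_iter().map(|task| task()).sum()
}

/// Hypothetical fixed-size job for a dedicated worker: plain data, no closure.
#[derive(Clone, Copy)]
struct FixedJob {
    offset: usize,
    len: usize,
}

/// The "hardcoded function" a dedicated thread would run for every job.
/// Here it just sums a byte range as a placeholder for hashing it.
fn run_fixed(input: &[u8], job: FixedJob) -> u64 {
    input[job.offset..job.offset + job.len]
        .iter()
        .map(|&x| x as u64)
        .sum()
}
```

Because every `FixedJob` has the same size, it can sit in a preallocated channel slot or array without a per-task heap allocation. The cost is that the worker can only ever do that one kind of work.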

Given that the whole goal of this design is to avoid overhead, but that any non-memory-mapping design necessarily pays a lot of copying overhead, there doesn't seem to be much value here. (Also, empirically, the additional allocation overhead seems to be pretty low when I've tried this.)