JuliaFolds2 / ChunkSplitters.jl

Splitting collections into chunks
https://juliafolds2.github.io/ChunkSplitters.jl/
MIT License
50 stars 7 forks source link

Improve docs #14

Closed carstenbauer closed 1 year ago

carstenbauer commented 1 year ago
codecov[bot] commented 1 year ago

Codecov Report

Merging #14 (e82cb75) into main (869a039) will not change coverage. The diff coverage is n/a.

@@           Coverage Diff           @@
##             main      #14   +/-   ##
=======================================
  Coverage   96.66%   96.66%           
=======================================
  Files           1        1           
  Lines          60       60           
=======================================
  Hits           58       58           
  Misses          2        2           
Files Coverage Δ
src/ChunkSplitters.jl 96.66% <ø> (ø)

:mega: Codecov offers a browser extension for seamless coverage viewing on GitHub. Try it in Chrome or Firefox today!

carstenbauer commented 1 year ago

If CI passes, this is ready to be merged IMHO. (cc @lmiq)

lmiq commented 1 year ago

Thank you very much!

One thing only: is there a compelling argument for using map/do/fetch instead of @sync and a simple loop? I find the use of map much harder to understand there, particularly in what the chunks and ranges mean in those examples.

carstenbauer commented 1 year ago

One thing only: is there a compelling argument for using map/do/fetch instead of @sync and a simple loop? I find the use of map much harder to understand there, particularly in what the chunks and ranges mean in those examples.

I'd argue that the map approach (1) cleanly highlights the data-parallel character, (2) discourages mutation (which is generally to be avoided in parallel algorithms), and, as a consequence of (2), naturally avoids potential race conditions. A simple loop unfortunately incentivises mutation (of a preallocated buffer) and, if done the wrong way (as is easily the case and also was the case before this PR), leads to issues like false sharing or - even worse - race conditions.

Note that the Julia docs themselves have a "good" parallel sum implementation that is also written in terms of map and is virtually identical to what I present here.

(Having said all that, loops are still fine, of course, if you know what you're doing.)

lmiq commented 1 year ago

Retrospectively I think chunks should return only the index ranges, and to get the chunk number one should use enumerate . That would make the use of map cleaner.