JuliaFolds / Transducers.jl

Efficient transducers for Julia
https://juliafolds.github.io/Transducers.jl/dev/
MIT License

`dcollect` fails where `pmap` works #525

Open tecosaur opened 2 years ago

tecosaur commented 2 years ago

Hi, I have an expensive function (each call uses a few GiB of memory, takes a few minutes, and puts a fair amount of pressure on the GC), so it lends itself to distributed Julia.

Unfortunately, trying to run this with Transducers.jl's `dcollect` simply doesn't work.

```julia
withprogress(hpo_terms, interval=0.1) |>
    Map(do_varpp_noerror) |>
    m -> dcollect(m, basesize=1, threads_basesize=1)
```

However, if I convert this to a `pmap` call, it executes successfully.

```julia
pmap(do_varpp_noerror, hpo_terms)
```

I can't remember exactly how `dcollect` failed (I switched to `pmap` a few weeks ago and am only opening this issue to let you know), but IIRC not a single call to `do_varpp_noerror` finished after many minutes, and memory usage was much higher than what I observed with `pmap`.
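
For anyone who wants to compare the two code paths side by side, here is a minimal sketch. It is not the original workload: `fake_work` is a hypothetical stand-in for `do_varpp_noerror`, the worker count and input range are arbitrary assumptions, and `withprogress` is left out to keep the comparison small. It may well not reproduce the hang or the memory growth described above; it only shows the two calls running on the same input.

```julia
using Distributed

# Assumption: two local workers that can load the same project environment.
addprocs(2)

@everywhere using Transducers

# Hypothetical stand-in for do_varpp_noerror: allocates a buffer,
# waits briefly, and returns a deterministic result.
@everywhere function fake_work(x)
    buf = zeros(10^6)   # a few MB, to put some pressure on the GC
    sleep(0.1)          # simulate a slow computation
    return x^2 + Int(sum(buf))
end

items = 1:8  # assumption: stand-in for hpo_terms

# Distributed.pmap hands items to workers one at a time by default.
t_pmap = @elapsed r_pmap = pmap(fake_work, items)

# Same keyword arguments as in the report above.
t_dcollect = @elapsed r_dcollect =
    dcollect(Map(fake_work), items; basesize=1, threads_basesize=1)

println("pmap:     ", round(t_pmap; digits=2), " s")
println("dcollect: ", round(t_dcollect; digits=2), " s")

rmprocs(workers())
```

Watching per-worker memory (for example with `top`) while each of the two calls runs is one way to check whether the difference in memory usage shows up with a given workload.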