JuliaIO / JLD.jl

Saving and loading julia variables while preserving native types
MIT License

writing with Threads.@threads #261

Open scottstanie opened 5 years ago

scottstanie commented 5 years ago

When I call JLD's write inside a loop, it seems to break the ability of Threads.@threads to distribute work across threads. This might be a Julia 1.3 bug, but I'm reporting it here first:

At first, the threading works (here I have set JULIA_NUM_THREADS=56 on a big machine):

Threads.@threads for i = 1:10
    println("i = $i on thread $(Threads.threadid())")
end
i = 4 on thread 4
i = 10 on thread 10
i = 2 on thread 2
i = 1 on thread 1
i = 9 on thread 9
i = 7 on thread 7
i = 5 on thread 5
i = 6 on thread 6
i = 3 on thread 3
i = 8 on thread 8
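
Once I write with JLD inside the threaded loop, though, every iteration runs on thread 1:

using JLD

function f1()
    # Write one layer to its own JLD file and report the calling thread.
    function innerwrite(layer, i)
        jldopen("testwrite$i.jld", "w") do f
            write(f, "layer", layer)
            # Note: the message says .h5, but the file written is testwrite$i.jld.
            println("writing testwrite$i.h5 on thread $(Threads.threadid())")
        end
    end
    rr = rand(800, 800, 10)
    # Each iteration should land on a different thread.
    Threads.@threads for i = 1:size(rr, 3)
        layer = rr[:, :, i]
        innerwrite(layer, i)
    end
end

julia> f1()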
writing testwrite1.h5 on thread 1
writing testwrite2.h5 on thread 1
writing testwrite3.h5 on thread 1
writing testwrite4.h5 on thread 1
writing testwrite5.h5 on thread 1
writing testwrite6.h5 on thread 1
writing testwrite7.h5 on thread 1
writing testwrite8.h5 on thread 1
writing testwrite9.h5 on thread 1
writing testwrite10.h5 on thread 1

Now the weirder part: after running f1(), the same simple loop from the start also stays on thread 1:

julia> Threads.@threads for i = 1:10
           println("i = $i on thread $(Threads.threadid())")
       end
i = 1 on thread 1
i = 2 on thread 1
i = 3 on thread 1
i = 4 on thread 1
i = 5 on thread 1
i = 6 on thread 1
i = 7 on thread 1
i = 8 on thread 1
i = 9 on thread 1
i = 10 on thread 1

So not only does JLD not save in parallel, it also breaks the earlier simple example from the blog post https://julialang.org/blog/2019/07/multithreading: once f1() has run, that loop no longer spreads across threads.

julia> versioninfo()
Julia Version 1.3.0-alpha.0
Commit 6c11e7c2c4 (2019-07-23 01:46 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)
Environment:
  JULIA_NUM_THREADS = 56

JeffBezanson commented 5 years ago

We'll have to add locks around the calls to hdf5.
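
Until something like that lands in the library, a user-side stopgap might look like the sketch below. It only illustrates the locking idea; it is not how JLD.jl or HDF5.jl implement anything. HDF5_LOCK, with_hdf5_lock, and write_layers are hypothetical names, and the writer argument stands in for whatever function does the actual jldopen/write. The lock serializes the writes, so the file I/O itself will not run in parallel, and whether this avoids the single-thread behavior reported above is untested here.

```julia
# Hypothetical global lock guarding every call that ends up in libhdf5.
const HDF5_LOCK = ReentrantLock()

# Run f while holding the lock; the lock is released even if f throws.
with_hdf5_lock(f) = lock(f, HDF5_LOCK)

# Slice rr along its third dimension in parallel, but let only one
# thread at a time call the (assumed non-thread-safe) writer.
function write_layers(writer, rr)
    Threads.@threads for i in 1:size(rr, 3)
        layer = rr[:, :, i]              # the copy can still happen in parallel
        with_hdf5_lock() do
            writer("testwrite$i.jld", layer)
        end
    end
end

# With JLD.jl loaded, the writer could be something like:
#
#   jld_writer(path, layer) = jldopen(path, "w") do f
#       write(f, "layer", layer)
#   end
#
#   write_layers(jld_writer, rand(800, 800, 10))
```

Taking the writer as an argument keeps the locking separate from the file format, so the same wrapper could guard any other call that reaches HDF5.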

nlw0 commented 2 years ago

I seem to be getting segfaults from JLD.jl when saving some computation results from within a Threads.@threads for loop. Could it be related to this? Tested on 1.6.3 and 1.7.0.