JuliaIO / JLD2.jl

HDF5-compatible file format in pure Julia

Fail-safe way to write into a jld2 file in parallel? #391

Closed jarroyoe closed 2 years ago

jarroyoe commented 2 years ago

I have a script that writes a Vector{Any} to a .jld2 file inside a for loop to save partial results of the runs. I'm trying to implement parallel processing using Threads, and while it works well most of the time, in my best run so far I got the following error:

ERROR: LoadError: TaskFailedException
Stacktrace:
 [1] wait
   @ ./task.jl:334 [inlined]
 [2] threading_run(func::Function)
   @ Base.Threads ./threadingconstructs.jl:38
 [3] top-level scope
   @ ./threadingconstructs.jl:97

    nested task error: ArgumentError: attempted to truncate a file that was already open
    Stacktrace:
     [1] jldopen(fname::String, wr::Bool, create::Bool, truncate::Bool, iotype::Type{JLD2.MmapIO}; fallback::Type{IOStream}, compress::Bool, mmaparrays::Bool)
       @ JLD2 ~/.julia/packages/JLD2/DcnTD/src/JLD2.jl:266
     [2] jldopen(fname::String, mode::String; iotype::Type, kwargs::Base.Pairs{Symbol, Bool, Tuple{Symbol}, NamedTuple{(:compress,), Tuple{Bool}}})
       @ JLD2 ~/.julia/packages/JLD2/DcnTD/src/JLD2.jl:342
     [3] jldopen(::JLD2.var"#56#57"{Base.Pairs{Symbol, Vector{Any}, Tuple{Symbol}, NamedTuple{(:df,), Tuple{Vector{Any}}}}}, ::String, ::Vararg{String}; kws::Base.Pairs{Symbol, Any, Tuple{Symbol, Symbol}, NamedTuple{(:compress, :iotype), Tuple{Bool, DataType}}})
       @ JLD2 ~/.julia/packages/JLD2/DcnTD/src/loadsave.jl:2
     [4] #jldsave#55
       @ ~/.julia/packages/JLD2/DcnTD/src/loadsave.jl:242 [inlined]
     [5] macro expansion
       @ ~/Invasibility/runs.jl:104 [inlined]
     [6] (::var"#46#threadsfor_fun#17"{UnitRange{Int64}})(onethread::Bool)
       @ Main ./threadingconstructs.jl:85
     [7] (::var"#46#threadsfor_fun#17"{UnitRange{Int64}})()
       @ Main ./threadingconstructs.jl:52
in expression starting at /home/jarroyoe/Invasibility/runs.jl:32

From what I understand, it seems that two threads tried to overwrite the file at the same time, which caused an error. Is there any way to prevent the threads from doing this?

JonasIsensee commented 2 years ago

Hi @jarroyoe, JLD2 does not support writing to the same file from multiple threads at the same time. If each of your threads does a lot of independent work and only writes results occasionally, you could use a lock as shown below. (The threads then wait for one another only while one of them is writing to the file.) Otherwise, you could have the threads write to separate files and combine them later.

julia> Threads.nthreads()
8

julia> filelock = ReentrantLock()
ReentrantLock(nothing, Base.GenericCondition{Base.Threads.SpinLock}(Base.InvasiveLinkedList{Task}(nothing, nothing), Base.Threads.SpinLock(0)), 0)

julia> using JLD2

julia> filename = "test_threaded.jld2"
"test_threaded.jld2"

julia> Threads.@threads for n = 1:1000
           # do expensive computation
           lock(filelock) do
               jldopen(filename, "a") do f
                   f["$n"] = n
               end
           end
       end

julia> f = jldopen(filename)
JLDFile /home/isensee/test_threaded.jld2 (read-only)
 ├─🔢 126
 ├─🔢 876
 ├─🔢 251
 ├─🔢 877
 ├─🔢 252
 ├─🔢 626
 ├─🔢 253
 ├─🔢 127
 ├─🔢 627
 └─ ⋯ (991 more entries)
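
For the other option mentioned above (separate files, combined later), a minimal sketch might look like the following. The names `outdir`, `partfile`, and `combined`, the iteration count, and the placeholder computation `n^2` are all assumptions made for illustration; they are not from the original script.

```julia
using JLD2

# Hypothetical output location and naming scheme, for illustration only.
outdir = mktempdir()
partfile(n) = joinpath(outdir, "part_$n.jld2")

Threads.@threads for n in 1:100
    result = n^2                          # stand-in for the expensive computation
    jldsave(partfile(n); result=result)   # each iteration writes only its own file
end

# Combine serially, after all threads have finished.
combined = joinpath(outdir, "combined.jld2")
jldopen(combined, "w") do f
    for n in 1:100
        f["$n"] = load(partfile(n), "result")
    end
end
```

Writing one file per iteration, rather than one per thread id, avoids relying on which thread runs which iteration. The cost is many small files and an extra serial pass at the end.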
jarroyoe commented 2 years ago

Thank you! Do locks reduce performance significantly? That is, do all threads stop whenever one of them is writing, or does the waiting only occur when two threads reach the write step at the same time?

JonasIsensee commented 2 years ago

Yeah, in principle the waiting should only occur when two threads attempt to write at the same time. If writing happens rarely, a lock may be the best option. However, this depends a lot on the application, so I suggest benchmarking it yourself. :)
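
One rough way to run such a benchmark is to record how long each iteration is blocked before it acquires the lock. The sketch below uses only Base; the `sleep` calls are stand-ins for the real computation and the real file write, and `nitems` and `waited` are hypothetical names.

```julia
using Base.Threads

filelock = ReentrantLock()
nitems = 16                       # hypothetical number of work items
waited = zeros(Float64, nitems)   # seconds each iteration spent waiting for the lock

@threads for n in 1:nitems
    sleep(0.01 * rand())          # stand-in for the expensive computation
    t0 = time()
    lock(filelock) do
        waited[n] = time() - t0   # time spent blocked before acquiring the lock
        sleep(0.005)              # stand-in for the jldopen/write step
    end
end

println("total wait: ", sum(waited), " s, max wait: ", maximum(waited), " s")
```

If the total wait is small compared with the overall runtime, the lock is probably not the bottleneck. If it is large, writing to separate files may be worth trying.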