JuliaIO / JLD2.jl

HDF5-compatible file format in pure Julia

Unknown Parallel Issue #478

Closed ejmeitz closed 11 months ago

ejmeitz commented 11 months ago

Hello,

When running my code I get a rather cryptic message (see below). My code works in serial and crashes in parallel. Reading through the message, it does not look like an issue with the parallel read that I added a few weeks back, since one of the lines it references is a write to a file. I'm not sure how to fix this error, though, as it's unclear what exactly the problem is.

I see #282 and I might be running out of RAM, but I'd be kind of surprised. It does seem to be intermittent: my jobs were half completed before this error occurred, and all of the earlier jobs wrote to disk fine.

Pseudocode:

@sync for i in 1:10
    @async begin
        for j in 1:10
            # functions that call JLD2 go here
        end
    end
end

The message:

[12703] signal (7.2): Bus error
in expression starting at /home/emeitz/scripts/main.jl:44
__memmove_ssse3_back at /lib64/libc.so.6 (unknown line)
unsafe_copyto! at ./array.jl:238 [inlined]
unsafe_write at /home/emeitz/.julia/packages/JLD2/cHcDY/src/mmapio.jl:206 [inlined]
unsafe_write at ./io.jl:685 [inlined]
raw_write at /home/emeitz/.julia/packages/JLD2/cHcDY/src/dataio.jl:123
write_data at /home/emeitz/.julia/packages/JLD2/cHcDY/src/dataio.jl:137 [inlined]
write_dataset at /home/emeitz/.julia/packages/JLD2/cHcDY/src/datasets.jl:537
jfptr_write_dataset_3553 at /home/emeitz/.julia/compiled/v1.9/JLD2/O1EyT_ce0CW.so (unknown line)
write_dataset at /home/emeitz/.julia/packages/JLD2/cHcDY/src/inlineunion.jl:44
unknown function (ip: 0x7effb00c18e4)
write_dataset at /home/emeitz/.julia/packages/JLD2/cHcDY/src/inlineunion.jl:36
#write#120 at /home/emeitz/.julia/packages/JLD2/cHcDY/src/compression.jl:137
write at /home/emeitz/.julia/packages/JLD2/cHcDY/src/compression.jl:125 [inlined]
write at /home/emeitz/.julia/packages/JLD2/cHcDY/src/compression.jl:125 [inlined]
setindex! at /home/emeitz/.julia/packages/JLD2/cHcDY/src/groups.jl:125 [inlined]
setindex! at /home/emeitz/.julia/packages/JLD2/cHcDY/src/JLD2.jl:477 [inlined]
#28 at /home/emeitz/.julia/packages/ModalAnalysis/MCsIL/src/NMA.jl:118
unknown function (ip: 0x7effb01513d2)
#jldopen#72 at /home/emeitz/.julia/packages/JLD2/cHcDY/src/loadsave.jl:4
jldopen at /home/emeitz/.julia/packages/JLD2/cHcDY/src/loadsave.jl:1 [inlined]
NMA_loop at /home/emeitz/.julia/packages/ModalAnalysis/MCsIL/src/NMA.jl:116
unknown function (ip: 0x7effb00d7976)
macro expansion at /home/emeitz/.julia/packages/TimerOutputs/RsWnF/src/TimerOutput.jl:237 [inlined]
run at /home/emeitz/.julia/packages/ModalAnalysis/MCsIL/src/NMA.jl:63
unknown function (ip: 0x7effb00b0996)
macro expansion at /home/emeitz/scripts/main.jl:57 [inlined]
#9 at ./task.jl:514
unknown function (ip: 0x7effb00799ff)
jl_apply at /home/emeitz/software/julia/src/julia.h:1879 [inlined]
start_task at /home/emeitz/software/julia/src/task.c:1092
Allocations: 8092807600 (Pool: 8069468923; Big: 23338677); GC: 19004
Bus error

ejmeitz commented 11 months ago

Oh, I am literally out of disk space. That's not a super useful error message from Julia lol.

I tried to run 'st' in the Pkg manager and it said: IOError: open("/home/emeitz/.julia/logs/manifest_usage.toml.pid", 194, 292): no space left on device (ENOSPC)
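
For anyone who lands here with the same trace: the failing frames go through mmapio.jl, so the write most likely went to a memory-mapped file, and on Linux a full disk can then show up as a bus error (SIGBUS) instead of an IOError. A rough sketch of a pre-flight check is below. The helper name has_free_space, the temporary directory, and the 1 MiB threshold are made up for illustration and are not part of JLD2; Base.diskstat needs Julia 1.8 or newer.

```julia
# Hypothetical helper (not part of JLD2): returns true when the filesystem
# containing `path` reports at least `needed_bytes` available to this user.
function has_free_space(path::AbstractString, needed_bytes::Integer)
    stats = Base.diskstat(path)          # requires Julia 1.8 or newer
    return stats.available >= needed_bytes
end

# Illustrative usage: the directory and size estimate are placeholders.
outdir = mktempdir()
needed = 1024^2                          # assumed estimate of the output size (1 MiB)

if has_free_space(outdir, needed)
    println("Enough free space in ", outdir)
else
    @warn "Not enough free space, skipping write" outdir needed
end
```

This only narrows the window: with several tasks writing at once the disk can still fill up between the check and the write, so it is a sanity check and not a guarantee.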