JuliaDataCubes / YAXArrays.jl

Yet Another XArray-like Julia package
https://juliadatacubes.github.io/YAXArrays.jl/

Distributed filling a datacube after change to DimensionalData #295

Open TabeaW opened 1 year ago

TabeaW commented 1 year ago

Before the breaking change to DimensionalData, cube[time=datetime] worked inside a @distributed for loop. It should now work with cube[Ti=At(datetime)], but the loop hangs and does nothing. Without @distributed it works fine. Any idea what might be going wrong?

felixcremer commented 1 year ago

What error and stacktrace do you get when you stop the execution with ctrl+C?

TabeaW commented 1 year ago

Stacktrace:
 [1] try_yieldto(undo::typeof(Base.ensure_rescheduled))
   @ Base ./task.jl:920
 [2] wait()
   @ Base ./task.jl:984
 [3] wait(c::Base.GenericCondition{Base.Threads.SpinLock}; first::Bool)
   @ Base ./condition.jl:130
 [4] wait(c::Base.GenericCondition{Base.Threads.SpinLock})
   @ Base ./condition.jl:125
 [5] _wait(t::Task)
   @ Base ./task.jl:308
 [6] sync_end(c::Channel{Any})
   @ Base ./task.jl:404
 [7] top-level scope
   @ task.jl:477

TabeaW commented 1 year ago

Now I am a bit confused. I tried it with an MWE and it seems to work nicely, so it must be a different problem! But I am seeing something that is unexpected (at least to me); maybe I am using it completely wrong:

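using Distributed
addprocs(2)  # assumption: the worker count was not given in the original post
@everywhere using YAXArrays, DimensionalData, Dates

@everywhere axlist = (Dim{:time}(DateTime(2020,1,1):Day(1):DateTime(2020,1,5)), Dim{:x}(range(1, 10, length=10)))
@everywhere cube = YAXArray(axlist, rand(5, 10))
@sync @distributed for datetime in DateTime(2020,1,1):Day(1):DateTime(2020,1,5)
    @show cube[time=At(datetime)].data, datetime
    cube[time=At(datetime)].data .= day(datetime)
end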

On the first run, @show inside the loop prints the random values, as expected. But if you display the cube after the loop has finished, it has not changed at all. If you then run the loop a second time, @show inside the loop prints the updated values, so the first run must have changed something.

TabeaW commented 1 year ago

Something else seems a bit strange to me: if you name the dimension :Ti instead of :time, cube[Ti=At(datetime)] results in this warning:

┌ Warning: (Ti,) dims were not found in object
└ @ DimensionalData.Dimensions ~/.julia/packages/DimensionalData/pS9IE/src/Dimensions/primitives.jl:659

TabeaW commented 1 year ago

The last one needs to be cube[Ti(At(datetime))] or cube[time=At(datetime)]

TabeaW commented 1 year ago

I thought I had solved it, but the same error occurs. Another problem is that for chunked data cubes, some chunks seem to be missing.

TabeaW commented 1 year ago

@everywhere datetimes=DateTime(2020,1,1):Day(1):DateTime(2020,1,5)
pmap(1:5,distributed=true) do i
    @show cube[time=At(datetimes[i])].data, datetimes[i]
    cube[time=At(datetimes[i])].data.=day(datetimes[i])
end

This does not work.

@everywhere datetimes=DateTime(2020,1,1):Day(1):DateTime(2020,1,5)
pmap(1:5,distributed=false) do i
    @show cube[time=At(datetimes[i])].data, datetimes[i]
    cube[time=At(datetimes[i])].data.=day(datetimes[i])
end

This works as expected.
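
One possible reading of the behaviour reported above, not confirmed anywhere in this thread: with Distributed, every worker process operates on its own copy of an in-memory array, so writes made inside @distributed or pmap(...; distributed=true) never reach the copy held by the main process. That would also explain why distributed=false, which runs everything in the main process, behaves as expected. The sketch below illustrates this with a plain Array and a SharedArray from the SharedArrays standard library. It assumes two local workers on the same machine and does not involve YAXArrays at all; the names plain and shared are made up for illustration.

```julia
using Distributed
addprocs(2)                      # assumption: two local workers on the same machine
@everywhere using SharedArrays

# Plain array: each worker receives its own copy of `plain`.
plain = zeros(5, 10)
@sync @distributed for i in 1:5
    plain[i, :] .= i             # writes go to the worker-side copy
end

# SharedArray: backed by shared memory that all local processes map.
shared = SharedArray{Float64}(5, 10)
fill!(shared, 0.0)
@sync @distributed for i in 1:5
    shared[i, :] .= i            # writes land in the shared buffer
end

@show all(iszero, plain)         # inspects the main-process copy of the plain array
@show shared[:, 1]               # inspects the first column of the shared array
```

If this reading is right, the main-process copy of plain should still be all zeros after the loop, while shared should hold the row indices. Whether a cube backed by a SharedArray or by an on-disk store is the right fix for the original use case is a separate question that this sketch does not answer.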