Open vchuravy opened 5 years ago
Maybe it would make more sense to follow that path in https://github.com/JuliaParallel/Dagger.jl? While considering a larger part of the execution graph can be useful, it also vastly complicates the implementation and makes it much harder for the user to reason about performance.
I consider this issue rather speculative, but we do need somewhere to issue operations without them necessarily blocking on each other.
The CuArray stream interface is particularly interesting, but it relies on a global order.
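To make the stream idea concrete, here is a minimal sketch of a stream-like queue: operations are issued without blocking the caller, run in issue order on a single worker task, and the caller only waits when it explicitly synchronizes. `OpStream`, `enqueue!`, and `synchronize` are hypothetical names for illustration; they are not part of DistributedArrays.jl or of the CuArray packages, and the sketch runs in a single process rather than across workers.

```julia
# Hypothetical sketch: OpStream, enqueue!, and synchronize are illustrative
# names, not an existing API.
struct OpStream
    queue::Channel{Function}  # pending operations, in issue order
    worker::Task              # single consumer, so operations run in order
end

function OpStream()
    queue = Channel{Function}(Inf)  # unbounded, so issuing never blocks
    worker = @async for op in queue
        op()
    end
    return OpStream(queue, worker)
end

# Issue an operation and return immediately.
enqueue!(s::OpStream, op::Function) = (put!(s.queue, op); nothing)

# Block until every operation issued so far has run.
function synchronize(s::OpStream)
    done = Channel{Nothing}(1)
    enqueue!(s, () -> put!(done, nothing))  # marker runs after all earlier ops
    take!(done)
    return nothing
end

# Usage: three operations are issued back to back; only `synchronize` waits.
s = OpStream()
order = Int[]
for i in 1:3
    enqueue!(s, () -> (sleep(0.01); push!(order, i)))
end
synchronize(s)
```

The single consumer is what provides the global order here. The sketch does no error handling: an operation that throws ends the worker task, and a later `synchronize` would then wait forever.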
One of the current issues with DArray is that each operation synchronizes immediately: a distributed operation must finish before we can carry on with scheduling new operations. This simplifies the design but limits scalability. Ideally, operations on DArray would be async/lazy, similar to how CuArray works, and only synchronize on `show` and `convert`.

The major design issue here is guaranteeing consistency. Operations need to appear to have executed in order. We might want to execute reads out of order, but we will still have to deal with in-place updates of data. One idea might be to use vector clocks, to look into how Fractal handles this, or to run a consensus protocol to establish which operations can commit.
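As a rough illustration of the vector clock idea, the sketch below keeps one logical counter per worker and uses it to tell whether one operation happened before another or whether the two are concurrent. All names (`VectorClock`, `tick`, `merge_clocks`, `happened_before`, `concurrent`) are hypothetical and chosen for this example; nothing here exists in DistributedArrays.jl, and how clocks would be attached to DArray operations is left open.

```julia
# Hypothetical sketch of vector clocks; none of these names are an existing API.
struct VectorClock
    counts::Vector{Int}  # one logical counter per worker
end

# Local event on worker `w`: bump that worker's counter.
function tick(c::VectorClock, w::Integer)
    counts = copy(c.counts)
    counts[w] += 1
    return VectorClock(counts)
end

# Worker `w` receives clock `b` while holding clock `a`:
# take the elementwise maximum, then count the receive as a local event.
merge_clocks(a::VectorClock, b::VectorClock, w::Integer) =
    tick(VectorClock(max.(a.counts, b.counts)), w)

# `a` happened before `b` if it is <= everywhere and < somewhere.
happened_before(a::VectorClock, b::VectorClock) =
    all(a.counts .<= b.counts) && any(a.counts .< b.counts)

# Distinct clocks that are not ordered either way are concurrent.
concurrent(a::VectorClock, b::VectorClock) =
    a.counts != b.counts && !happened_before(a, b) && !happened_before(b, a)

# Usage with two workers.
c0 = VectorClock(zeros(Int, 2))
a = tick(c0, 1)             # in-place update issued on worker 1
b = tick(c0, 2)             # independent update issued on worker 2
c = merge_clocks(b, a, 2)   # worker 2 learns about `a`, then acts

concurrent(a, b)        # true: neither update is ordered before the other
happened_before(a, c)   # true: `c` was issued with knowledge of `a`
```

Under this scheme, operations ordered by `happened_before` could be committed in that order, while concurrent in-place updates to the same data would still need some tie-breaking rule, which is where a consensus step might come in.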