pangeo-data / rechunker

Disk-to-disk chunk transformation for chunked arrays.
https://rechunker.readthedocs.io/
MIT License

Use finer-grained task dependencies to avoid global barrier #70

Open rabernat opened 3 years ago

rabernat commented 3 years ago

Just making a note of an idea I had the other day.

Right now our graphs look something like this:

[image: current task graph]

The big global barrier in the middle is a bit unfortunate, as all the workers end up blocking on the slowest intermediate write task. It would be nice if some target write tasks could start before every intermediate write has finished.

This should be fairly easy to accomplish. Instead of making every target write task depend on all intermediate write tasks, we could make each one depend only on the intermediate writes that produce the data it reads. We just have to work out the mapping between write tasks and read tasks.
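One way that mapping could be worked out is sketched below. This is only an illustration, not rechunker's actual code: the function name `overlapping_blocks` and its arguments are hypothetical, and it assumes both stages tile the same array shape with regular chunks starting at the origin. For each second-stage (target write) block it lists the first-stage (intermediate write) blocks whose regions overlap it.

```python
from itertools import product
from math import ceil


def overlapping_blocks(shape, first_chunks, second_chunks):
    """Map each second-stage block index to the first-stage block indices it overlaps.

    Hypothetical helper: assumes regular chunk grids aligned at the origin.
    """
    n_second = [ceil(s / c) for s, c in zip(shape, second_chunks)]
    deps = {}
    for idx in product(*(range(n) for n in n_second)):
        per_axis = []
        for i, s, c1, c2 in zip(idx, shape, first_chunks, second_chunks):
            start = i * c2
            stop = min(start + c2, s)  # exclusive end, clipped at the array edge
            # first-stage blocks along this axis that intersect [start, stop)
            per_axis.append(range(start // c1, (stop - 1) // c1 + 1))
        deps[idx] = list(product(*per_axis))
    return deps


# 1D example: first stage writes blocks of 2, second stage reads blocks of 4
deps = overlapping_blocks(shape=(8,), first_chunks=(2,), second_chunks=(4,))
print(deps)
```

In this example the second-stage block `(0,)` would only need first-stage blocks `(0,)` and `(1,)`, so it could start as soon as those two have been written. In the worst case (for example row chunks being turned into column chunks) every second-stage block overlaps every first-stage block, and the dependencies amount to the same global barrier as before.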