Open rabernat opened 4 years ago
(Unfortunately I can't find the docs on those dunder methods.)
I think https://docs.dask.org/en/latest/custom-collections.html has what we want.
I thought briefly about having that method return a dask.Array that would read from the eventual files. But that doesn't feel quite right, since .compute() would have a strange meaning (does it mean kick off the computation, or bring the results into memory, or both?).
I thought briefly about having that method return a dask.Array that would read from the eventual files
:-1:
I think we should focus exclusively on storing data here. We should make it clear in the docs how to read the target data, but shouldn't actually return it.
Should we implement a custom collection?
You might try just subclassing Delayed first:
    from dask.delayed import Delayed

    class RechunkingPlan(Delayed):
        @property
        def source_chunks(self):
            ...

        def _repr_html_(self):
            ...
I'm not 100% sure if this would work, but it'll be less effort than implementing a custom collection.
Right now we just return a delayed object. I think instead we should return an object called a RechunkingPlan. This object could expose useful parameters and implement the dask dunder methods to allow us to write code like
(Unfortunately I can't find the docs on those dunder methods.)
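For reference, the dunder methods in question are the ones listed on dask's custom-collections page: __dask_graph__, __dask_keys__, __dask_postcompute__, __dask_postpersist__, __dask_tokenize__, plus the static __dask_optimize__ and __dask_scheduler__. Below is a rough sketch of what a RechunkingPlan built on them might look like. It is not the project's implementation: the constructor arguments, target_chunks, the execute() method and the toy graph are all made up for illustration, __dask_postpersist__ is left out for brevity, and dask is imported lazily so the class definition itself has no hard dependency.

```python
def _first(results):
    # dask hands results back in the same nesting as __dask_keys__ (a one-item list here)
    return results[0]


class RechunkingPlan:
    """Hypothetical minimal dask collection wrapping a single-key task graph."""

    def __init__(self, graph, key, source_chunks, target_chunks):
        self._graph = dict(graph)           # plain-dict dask task graph
        self._key = key                     # key whose evaluation runs the whole plan
        self.source_chunks = source_chunks  # exposed for inspection
        self.target_chunks = target_chunks  # assumed parameter, not from the thread

    # --- dask collection protocol ---
    def __dask_graph__(self):
        return self._graph

    def __dask_keys__(self):
        return [self._key]

    def __dask_postcompute__(self):
        # (finalizer, extra_args): unwrap the single result
        return _first, ()

    def __dask_tokenize__(self):
        return (type(self).__name__, self._key)

    @staticmethod
    def __dask_optimize__(dsk, keys, **kwargs):
        # No graph optimization in this sketch
        return dsk

    @staticmethod
    def __dask_scheduler__(dsk, keys, **kwargs):
        # Lazy import: dask is only needed when the plan is actually run
        from dask.threaded import get
        return get(dsk, keys, **kwargs)

    def execute(self, **kwargs):
        # Hypothetical name, chosen to avoid the ambiguity of .compute()
        import dask
        return dask.compute(self, **kwargs)[0]


def _noop(*args):
    # Stand-in for the real copy tasks
    return None


graph = {
    "copy-0": (_noop,),
    "copy-1": (_noop,),
    "rechunk-done": (_noop, "copy-0", "copy-1"),
}
plan = RechunkingPlan(graph, "rechunk-done",
                      source_chunks=(100, 10), target_chunks=(10, 100))
```

With dask installed, plan.execute() (or dask.compute(plan)) should hand the graph to the threaded scheduler. Nothing is read back into memory, which is in line with the "storing only" position above.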
We could have an html repr with a table with information like
etc.
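A quick sketch of how such an HTML repr could be put together, using only the standard library. Jupyter picks up any object with a _repr_html_ method and renders the returned string. The PlanSummary class and the fields shown (source_chunks, target_chunks) are placeholders for illustration; the actual set of fields for the table is still open.

```python
from html import escape


class PlanSummary:
    """Hypothetical holder for the values shown in the plan's HTML table."""

    def __init__(self, **fields):
        self._fields = fields

    def _repr_html_(self):
        # One table row per field; escape values so tuples/strings render safely
        rows = "".join(
            f"<tr><th>{escape(str(name))}</th><td>{escape(str(value))}</td></tr>"
            for name, value in self._fields.items()
        )
        return f"<table>{rows}</table>"


summary = PlanSummary(source_chunks=(100, 10), target_chunks=(10, 100))
html = summary._repr_html_()
```

The same method could live directly on RechunkingPlan; it is kept separate here only so the snippet runs on its own.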