dask / dask-expr

BSD 3-Clause "New" or "Revised" License

Add first array draft #1090

Closed: fjetter closed this 1 week ago

fjetter commented 1 week ago

This is an attempt to get https://github.com/dask/dask-expr/pull/471 passing.

Apart from missing implementations, I ran into a lot of trouble in the assertion code. For example,

https://github.com/dask/dask/blob/36e9d7c84c619abcfc498e3fdf7b37f7fb8fd400/dask/array/utils.py#L240

accesses a key of an HLG and expects this to resolve to the key in the low-level graph; see the custom `__getitem__` logic at https://github.com/dask/dask/blob/36e9d7c84c619abcfc498e3fdf7b37f7fb8fd400/dask/highlevelgraph.py#L508-L528. This works... at times. In particular, the later part of the graph expects this result, i.e. the low-level graph value, to be an actual array (since it was persisted a couple of lines further above). That breaks because we have to attach a dummy layer in the FromGraph (to avoid name collisions). Ideally, this would use a "get Nth chunk" API, but I'm not sure whether that exists.
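
To make the failure mode concrete, here is a minimal sketch in plain Python. It is a toy model, not dask's code: `ToyHighLevelGraph` is a hypothetical class whose lookup is only loosely modeled on the linked `__getitem__` logic, and the shape of the dummy layer (an alias pointing at the persisted key, under the made-up name `fromgraph-x`) is an assumption for illustration.

```python
from typing import Any, Hashable, Mapping


class ToyHighLevelGraph:
    """Hypothetical stand-in for an HLG: layer name -> {key: value}."""

    def __init__(self, layers: Mapping[str, Mapping[Hashable, Any]]) -> None:
        self.layers = {name: dict(layer) for name, layer in layers.items()}

    def __getitem__(self, key: Hashable) -> Any:
        # Fast path 1: the layer is named exactly like the key.
        try:
            return self.layers[key][key]
        except (KeyError, TypeError):
            pass
        # Fast path 2: the layer is named like the first element of a tuple key.
        try:
            return self.layers[key[0]][key]
        except (KeyError, IndexError, TypeError):
            pass
        # Slow path: scan every layer for the key.
        for layer in self.layers.values():
            if key in layer:
                return layer[key]
        raise KeyError(key)


# After persisting, the low-level value is the concrete chunk
# (a plain list stands in for an array here).
persisted = ToyHighLevelGraph({"x": {("x", 0): [1, 2, 3]}})
direct_value = persisted[("x", 0)]

# Assumed shape of the wrapper: a dummy layer whose only task is an alias
# to the persisted key, added to avoid name collisions.
wrapped = ToyHighLevelGraph(
    {
        "x": {("x", 0): [1, 2, 3]},
        "fromgraph-x": {("fromgraph-x", 0): ("x", 0)},
    }
)
aliased_value = wrapped[("fromgraph-x", 0)]

# The lookup succeeds in both cases, but only the first yields chunk data.
direct_is_chunk = isinstance(direct_value, list)
aliased_is_chunk = isinstance(aliased_value, list)
```

In this model the lookup through the dummy layer returns a reference to another key rather than the chunk itself, so any check that assumes "the graph value is already an array" fails even though the data sits one hop away.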

There are likely plenty of other problems around here, but this is what I have found so far.

phofl commented 1 week ago

Let's merge this.