Open keewis opened 4 years ago
Another way to get this to work is to explicitly register types that can be wrapped by a certain duck array (see e.g. dask's implementation from #6393).
This is probably the simplest and most explicit way, but we still need to figure out where to put the actual registration (however, since we don't have that many duck arrays yet, it might be fine to ignore this for now).
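To make the registration idea a bit more concrete, here is a minimal sketch of a central registry. Everything in it (`register_wrappable`, `can_wrap`, the `*Like` placeholder classes) is made up for illustration and is not the API of dask, pint, or any other library:

```python
# Hypothetical central registry: wrapper type -> set of types it may wrap.
# None of these names come from an existing library.
_wrappable = {}


def register_wrappable(wrapper, wrapped):
    """Record that instances of `wrapper` may wrap instances of `wrapped`."""
    _wrappable.setdefault(wrapper, set()).add(wrapped)


def can_wrap(wrapper, wrapped):
    """True if `wrapped` is (a subclass of) a type registered for `wrapper`."""
    return any(issubclass(wrapped, t) for t in _wrappable.get(wrapper, ()))


# Placeholder classes standing in for real duck array types.
class NumpyLike: ...
class DaskLike: ...
class PintLike: ...


# The open question from above is where these calls would live.
register_wrappable(DaskLike, NumpyLike)
register_wrappable(PintLike, NumpyLike)
register_wrappable(PintLike, DaskLike)

print(can_wrap(PintLike, DaskLike))  # pint-like may wrap dask-like
print(can_wrap(DaskLike, PintLike))  # the reverse was never registered
```

With this kind of registry the nesting order is never inferred: anything that was not registered is simply refused.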
cc @jthielen, @TomNicholas, @shoyer, @amcnicho (if I remember correctly you were interested in this?)
Discussion on this topic (and hopefully converging efforts towards a resolution) is welcome at https://github.com/pydata/duck-array-discussion/issues/3!
Part of the discussion in #5329. I'm opening this issue to make the discussion on this topic a bit more focused.
As mentioned in https://github.com/dask/dask/issues/5329#issuecomment-691992501, we need to find a way to ensure that `pint(dask(numpy))` is chosen over `dask(pint(numpy))`.
Parts of this have already been discussed in #6393.
To solve both of these issues, the type hierarchy from hgrecco/pint#845 could be used, but we'd still need to figure out how to compare within that hierarchy, and we'd probably have to maintain a package that collects the relationships between different packages.
Similarly, we could have duck arrays maintain a list of duck arrays they can wrap. This is still pretty static and the list might grow too large for packages that are fairly high in the type hierarchy, but it would allow granular control over the interaction with other duck arrays.
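As a rough sketch of the per-array list, assuming each duck array class carries a hypothetical `wraps` attribute (the attribute name, the `outermost` helper, and the `*Like` classes are all invented here):

```python
# Placeholder duck array classes; `wraps` is a hypothetical attribute
# listing the types each class declares it can wrap.
class NumpyLike:
    wraps = ()


class DaskLike:
    wraps = (NumpyLike,)


class PintLike:
    wraps = (DaskLike, NumpyLike)


def outermost(a, b):
    """Return the type that should wrap the other, or None if undecidable."""
    a_wraps_b = any(issubclass(b, t) for t in a.wraps)
    b_wraps_a = any(issubclass(a, t) for t in b.wraps)
    if a_wraps_b and not b_wraps_a:
        return a
    if b_wraps_a and not a_wraps_b:
        return b
    # Neither declares the other, or both do: no unique answer.
    return None


print(outermost(DaskLike, PintLike).__name__)
```

Note how `PintLike.wraps` already has to name every type below it, which is the growth problem for packages high up in the hierarchy.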
In #6393, it was suggested to divide duck arrays into categories and then have duck arrays in categories with a higher number take care of those in a category with a lower number. However, this breaks down as soon as you have duck arrays that belong to multiple categories (or have two duck arrays that belong to the same category wrap each other), and adding new categories is difficult (at least if the categories are identified by numbers).
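A small sketch of the numbered-category scheme and where it stops working. The category numbers and the names `A`, `B`, `C` are arbitrary placeholders, not an assignment anyone has proposed:

```python
# Hypothetical category numbers: higher number wraps lower number.
CATEGORY = {"A": 0, "B": 1, "C": 1}


def wrapper_of(a, b):
    """Return the name of the array that should be on the outside, or None."""
    if CATEGORY[a] == CATEGORY[b]:
        # Same category: the scheme gives no answer for B wrapping C
        # (or C wrapping B).
        return None
    return a if CATEGORY[a] > CATEGORY[b] else b


print(wrapper_of("A", "B"))
print(wrapper_of("B", "C"))
```

Separating `B` and `C` would require a category between 1 and 2, or renumbering everything above them, which is the "adding new categories is difficult" part.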
Using that idea, we could have duck arrays declare a tuple of categories they belong to and then a tuple of categories they can wrap. We could then compute a set operation to decide which is wrapped, but this still breaks down for coarse categories (duck arrays in the same category wrapping each other) and circular graphs (i.e. `A` is in categories `x` and `z` and can wrap `y`, while `B` is in category `y` and can wrap `x`) – not sure if that's an issue?
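The circular case can be reproduced with a few lines. This is only a sketch: it assumes the "set operation" is a plain intersection between what one array can wrap and the categories the other belongs to, and the `Duck` class and its fields are invented for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Duck:
    """Hypothetical description of a duck array's category declarations."""
    name: str
    categories: frozenset  # categories this array belongs to
    can_wrap: frozenset    # categories this array is able to wrap


def wraps(outer, inner):
    """`outer` may wrap `inner` if it can wrap any of `inner`'s categories."""
    return bool(outer.can_wrap & inner.categories)


# The example from the text: A is in x and z and can wrap y,
# B is in y and can wrap x.
A = Duck("A", frozenset({"x", "z"}), frozenset({"y"}))
B = Duck("B", frozenset({"y"}), frozenset({"x"}))

print(wraps(A, B), wraps(B, A))
```

Both checks succeed, so the declarations alone do not pick between `A(B(...))` and `B(A(...))`; some extra tie-breaking rule would be needed if this situation can actually occur.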