data-apis / array-api

RFC document, tooling and other content related to the array API standard
https://data-apis.github.io/array-api/latest/
MIT License

Include `__array_namespace__()` on `dtype` objects? #704

Closed. Zac-HD closed this issue 8 months ago

Zac-HD commented 8 months ago

Per https://github.com/pydata/xarray/pull/8404#discussion_r1382343801, there are cases where this would allow us to generate a compatible array.

asmeurer commented 8 months ago

I'm a little unclear how you end up with a dtype object but not the namespace in a Hypothesis context. Doesn't Hypothesis require you to specify the namespace to begin with?

More generally, though, I suppose there could be instances where you have a function that only takes a dtype as an argument, although I'm not aware of this coming up in any actual library yet.

Zac-HD commented 8 months ago

Xarray would like to have a `variables()` strategy, which takes a strategy to generate dtypes and an `array_strategy_fn` (a function which takes a dtype and returns a strategy to generate an array). If dtypes knew their array namespace, then we could automatically infer the correct `array_strategy_fn` without the user needing to provide one.

I agree it's a niche use-case, but thought it was worth asking about.
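
To make the request concrete, here is a minimal sketch of the inference described above. It is not Xarray's or Hypothesis's API: `ToyDType`, `ToyNamespace`, and `infer_array_strategy_fn` are hypothetical names, a "strategy" is modelled as a plain callable taking a `random.Random`, and `__array_namespace__()` on a dtype is exactly the assumption under discussion, not something the standard provides.

```python
import random
from dataclasses import dataclass
from typing import Any, Callable, List

# Stand-in for a Hypothesis strategy: a callable that draws a value
# from a random source. This is NOT the Hypothesis API.
Strategy = Callable[[random.Random], Any]


class ToyNamespace:
    """Hypothetical array namespace; arrays are plain lists of Python values."""

    @staticmethod
    def asarray(values: List[Any], dtype: "ToyDType") -> List[Any]:
        return [dtype.cast(v) for v in values]


@dataclass(frozen=True)
class ToyDType:
    """Hypothetical dtype that knows its namespace (the proposal in this issue)."""

    name: str
    cast: Callable[[Any], Any]

    def __array_namespace__(self) -> Any:
        # Assumed for illustration; the standard defines this on arrays only.
        return ToyNamespace


def infer_array_strategy_fn(dtype: ToyDType) -> Strategy:
    """Build an array strategy from the dtype alone, via its namespace."""
    xp = dtype.__array_namespace__()

    def draw(rng: random.Random) -> List[Any]:
        size = rng.randint(0, 5)
        raw = [rng.uniform(-10, 10) for _ in range(size)]
        return xp.asarray(raw, dtype=dtype)

    return draw


def variables(dtypes: Strategy, array_strategy_fn=infer_array_strategy_fn) -> Strategy:
    """Draw a dtype, then draw an array of that dtype."""

    def draw(rng: random.Random):
        dtype = dtypes(rng)
        return dtype, array_strategy_fn(dtype)(rng)

    return draw
```

With this shape, a caller only supplies the dtype strategy, for example `variables(lambda rng: rng.choice([int64, float64]))`, and the array strategy falls out of the dtype.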

rgommers commented 8 months ago

This doesn't seem like the best idea to me. Not only would it cause a non-negligible amount of work, it would also cause problems for any library whose dtypes are not part of the array API standard, or which (like NumPy) allows creating and registering third-party dtypes.

Can this be solved with an Xarray-specific method/hook somehow?

Zac-HD commented 8 months ago

Let's round this to "doesn't seem to be worth solving" and move on!

asmeurer commented 8 months ago

Also Dask literally just reuses the NumPy dtypes.
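
The ambiguity behind that last point can be sketched without either library. Everything below is hypothetical: `DType`, `EagerArray`, and `LazyArray` are stand-ins for a shared dtype object and two array libraries that reuse it, and the namespaces are strings rather than modules to keep the example small.

```python
class DType:
    """Hypothetical dtype object; stands in for something like a NumPy dtype."""

    def __init__(self, name: str) -> None:
        self.name = name


# One dtype object, reused by two different (hypothetical) array libraries.
FLOAT64 = DType("float64")


class EagerArray:
    def __init__(self, dtype: DType) -> None:
        self.dtype = dtype

    def __array_namespace__(self, api_version=None) -> str:
        # A real implementation returns a module; a string keeps the sketch small.
        return "eager_lib"


class LazyArray:
    def __init__(self, dtype: DType) -> None:
        self.dtype = dtype

    def __array_namespace__(self, api_version=None) -> str:
        return "lazy_lib"


def namespaces_for(dtype: DType, arrays) -> list:
    """Collect every namespace whose arrays use the given dtype object."""
    return sorted({a.__array_namespace__() for a in arrays if a.dtype is dtype})


arrays = [EagerArray(FLOAT64), LazyArray(FLOAT64)]
candidates = namespaces_for(FLOAT64, arrays)
```

Each array can name its namespace, but the shared dtype maps to two candidates, so a dtype-level `__array_namespace__()` would have no single correct answer in this situation.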