Quansight-Labs / uarray

A general dispatch and override mechanism for Python.
https://uarray.org/
BSD 3-Clause "New" or "Revised" License
102 stars 26 forks source link

Heirarchical domains #189

Closed peterbell10 closed 4 years ago

peterbell10 commented 5 years ago

As discussed in https://github.com/scipy/scipy/pull/10383/commits/7acf8ba43485a76d720f6e62d4e510a897131eff#issuecomment-510888182

The domain a.b.c should be treated as a root domain a with sub-domains a.b and a.b.c. This means that calling a function in the leaf domain (a.b.c) should call backends registered to any of the super-domains, walking up the tree.

I think it would also be useful to be able to disable this on a per sub-domain basis so that the end-user can have more control over the backends being called and so they can fail-fast when not calling into the backend they expect.

hameerabbasi commented 5 years ago

By the user, do you mean the author of a given backend (the person who writes __ua_function__) or the API author (the person who writes the multimethods)? A simple solution would be to expose the domain on the multimethod or C-level _Function object. I can do this if needed.

hameerabbasi commented 5 years ago

Or do you mean immediately writing said code for hierarchical domains? I can do that too.

peterbell10 commented 5 years ago

By user, I mean the person installing backends and calling multimethods.

Edit: "End-user" in the uarray documentation.

hameerabbasi commented 5 years ago

How about a ua.skip_domain(domain: str, subdomains: bool=True) context manager?

hameerabbasi commented 5 years ago

I'd still prefer the correct method: Don't depend on the name of the multimethod, it's a flaky decision at best. Also check the __module__ within __ua_function__.

peterbell10 commented 5 years ago

I'd still prefer the correct method: Don't depend on the name of the multimethod

This has nothing to do with it, it's about giving the end-user more control over what backend gets called. e.g. if I install the pyfftw backend to numpy.scipy.fft, I might want to make absolutely sure that it's the one being called and not the scipy default.

hameerabbasi commented 5 years ago

This has nothing to do with it, it's about giving the end-user more control over what backend gets called. e.g. if I install the pyfftw backend to numpy.scipy.fft, I might want to make absolutely sure that it's the one being called and not the scipy default.

This is already possible with the set_global_backend(..., only=True) and set_backend(..., only=True), but as I understand currently you'd like to limit backends to those with domains of numpy.scipy.fft?

peterbell10 commented 5 years ago

This is already possible with the set_global_backend(..., only=True)

If you do that then registered backends aren't usable.

hameerabbasi commented 5 years ago

Currently subdomains have exact matching, so there is no point in such functionality... Unless you'd like a dummy function that'll gain this functionality once hierarchical domains are implemented?

peterbell10 commented 5 years ago

Currently subdomains have exact matching, so there is no point in such functionality...

That's why it's in the same issue :)

hameerabbasi commented 5 years ago

Okay. We'll add this once hierarchical domains are implemented.

hameerabbasi commented 5 years ago

cc @peterbell10 @rgommers

The reason I want numpy backends to be tried before numpy.fft for example:

The reason I'd like this over a list of domains:

peterbell10 commented 5 years ago

dask.array ought to become a numpy backend. It should take precedence over MKL FFT

I don't think that they should be in competition with each other. If you pass in a dask array to the mkl_fft backend it should return NotImplemented and then the dask backend would be called anyway.

In your example, I think numpy.fft should be tried first because I am assuming that a more specialised backend would only be installed if it was in some way preferred over the less specialised version e.g. MKL FFT should be preferred over the default numpy FFTs because it's faster. If it's done as you suggest then installing mkl_fft and the default numpy backend wouldn't ever call into mkl_fft.

hameerabbasi commented 5 years ago

In your example, I think numpy.fft should be tried first because I am assuming that a more specialised backend would only be installed if it was in some way preferred over the less specialised version e.g. MKL FFT should be preferred over the default numpy FFTs because it's faster. If it's done as you suggest then installing mkl_fft and the default numpy backend wouldn't ever call into mkl_fft.

Fair enough... You've convinced me. 😄

hameerabbasi commented 5 years ago

@peterbell10 In light of the discussion we had yesterday, would it be better to just reverse the order of walking the tree? i.e. root to leaf, rather than leaf to root? It seems that's the cleanest solution here.

peterbell10 commented 5 years ago

To be clear, when you say "root to leaf" do you mean e.g. try numpy first then numpy.fft? I thought we had agreed on the other way round.

hameerabbasi commented 5 years ago

I racked my brain and I can't seem to find a situation where trying numpy.fft first would be useful.

peterbell10 commented 5 years ago

What about mkl_fft?

hameerabbasi commented 5 years ago

That should be tried ideally after CuPy, Dask, or PyData/Sparse.

peterbell10 commented 5 years ago

But it will never be tried because it would just call the numpy backend.

rgommers commented 5 years ago

If mkl_fft is tried first, it will anyway say NotImplemented for CuPy/Dask/Sparse arrays right? So it's fine to try mkl_fft first? More specialized to less specialized makes sense to me as the general principle here.

hameerabbasi commented 4 years ago

I thought about this a bit more, and decided it's best that the user decided the order. So, inner with statements take precedence over outer with statements in all cases, but what this issue will change is the range of the backend being applied. So numpy would apply to numpy.*, numpy,scipy would apply to numpy.* and scipy.*.

hameerabbasi commented 4 years ago

If mkl_fft is tried first, it will anyway say NotImplemented for CuPy/Dask/Sparse arrays right?

This is not strictly true, CuPy coerces whenever it can IIUC.

peterbell10 commented 4 years ago

If mkl_fft is tried first, it will anyway say NotImplemented for CuPy/Dask/Sparse arrays right?

This is not strictly true, CuPy coerces whenever it can IIUC.

I don't think so, no. In normal usage CuPy doesn't coerce at all. In the scipy.fft backend in particular, it always returns NotImplemented from __ua_convert__.

I see that the unumpy backend doesn't return NotImplemented though:

https://github.com/Quansight-Labs/unumpy/blob/96d707188e4f9e1cf0b19f8a18e1685efd595dd5/unumpy/cupy_backend.py#L30

That seems like a bug. Otherwise you can't use unumpy to dispatch to numpy and cupy arrays at the same time.

peterbell10 commented 4 years ago

it's best that the user decided the order. So, inner with statements take precedence over outer with statements in all cases

We still need to make a decision for global and registered backends. Those don't have a defined ordering and time of registry is not strictly controlled by the user.

hameerabbasi commented 4 years ago

We still need to make a decision for global and registered backends. Those don't have a defined ordering and time of registry is not strictly controlled by the user.

It should follow the same rule -- No guarantee for registered backends, and a bug to rely on it. Global backends -- Submodules before their parents.

hameerabbasi commented 4 years ago

I don't think so, no. In normal usage CuPy doesn't coerce at all. In the scipy.fft backend in particular, it always returns NotImplemented from __ua_convert__.

Does it do this for numpy.ndarray as well? If so we should follow the same behaviour here.

peterbell10 commented 4 years ago

I'll be more explicit: __ua_convert__ returns NotImplemented if any of the arrays are not cupy.ndarray. Otherwise, it passes through the arguments unchanged.

peterbell10 commented 4 years ago

Unless coerce=True, of course

hameerabbasi commented 4 years ago

I'll be more explicit: __ua_convert__ returns NotImplemented if any of the arrays are not cupy.ndarray. Otherwise, it passes through the arguments unchanged.

In this case we can change unumpy to do this and then adopt the order of going specialized -> parent everywhere.

hameerabbasi commented 4 years ago

don't think so, no. In normal usage CuPy doesn't coerce at all. In the scipy.fft backend in particular, it always returns NotImplemented from __ua_convert__.

I see that the unumpy backend doesn't return NotImplemented though:

https://github.com/Quansight-Labs/unumpy/blob/96d707188e4f9e1cf0b19f8a18e1685efd595dd5/unumpy/cupy_backend.py#L30

That seems like a bug. Otherwise you can't use unumpy to dispatch to numpy and cupy arrays at the same time.

I fixed this in Quansight-Labs/unumpy#55.