Create an `Op` for NumPy's generalized `ufunc`s

brandonwillard commented 2 years ago

We need to extend our existing ufunc support—currently provided by the Elemwise Op—with a new Op that covers NumPy's gufuncs. Let's call this proposed Op Blockwise.

This new Blockwise Op would allow us to generalize at least a few existing Ops, and it would provide a very direct bridge to Numba and JAX's equivalent functionality (e.g. Numba's direct support for gufuncs and JAX's vmap). It would also allow us to implement a very convenient np.vectorize-like helper function.

The implementation details behind a Blockwise Op will likely involve a lot of the same logic as Elemwise and RandomVariable. At a high level, the gradient methods in the former could be extended to account for non-scalar elements, while the latter demonstrates some of the relevant shape logic.

The RandomVariable Op already works similarly to gufuncs, because it supports generic "base" random variable "op"s that map distinctly shaped inputs to potentially non-scalar outputs. A good example is the MultinomialRV Op; its gufunc-like signature would be (), (n) -> (n).

Here's an illustration:

import numba
import numpy as np

import aesara.tensor as at

#
# Existing `gufunc`-like functionality provided by `RandomVariable`
#
X_rv = at.random.multinomial(np.r_[1, 2], np.array([[0.5, 0.5], [0.4, 0.6]]))
X_rv.eval()
# array([[0, 1],
#        [2, 0]])

#
# A NumPy `vectorize` equivalent (this doesn't create a `gufunc`)
#
multinomial_vect = np.vectorize(np.random.multinomial, signature="(),(n)->(n)")
multinomial_vect(np.r_[1, 2], np.array([[0.5, 0.5], [0.4, 0.6]]))
# array([[1, 0],
#        [2, 0]])

#
# A Numba example that creates a NumPy `gufunc`
#
@numba.guvectorize([(numba.int64, numba.float64[:], numba.int64[:])], "(),(n)->(n)")
def multinomial_numba(a, b, out):
    out[:] = np.random.multinomial(a, b)

multinomial_numba(np.r_[1, 2], np.array([[0.5, 0.5], [0.4, 0.6]]))
# array([[1, 0],
#        [1, 1]])

See the originating discussion here.

kc611 commented 2 years ago

One way I see this Op being implemented is by having it store an Function as a property (similar to how Elemwise does with it's atomic functions) and then calling it upon the Numpy arrays while looping over inputs in Op.perform. (The signature and relevant shape information can be supplied and processed during __init__)

Otherwise if we want the inner FunctionGraph to be 'a part' of the outer FunctionGraph, that'd require an interface similar to Scan.

brandonwillard commented 2 years ago

One way I see this Op being implemented is by having it store an Function as a property (similar to how Elemwise does with it's atomic functions) and then calling it upon the Numpy arrays while looping over inputs in Op.perform. (The signature and relevant shape information can be supplied and processed during __init__)

~This approach would preclude us from providing a np.vectorize-like interface for Ops.~ If I understand what you're saying correctly, that's effectively the same as the approaches mentioned below.

Otherwise if we want the inner FunctionGraph to be 'a part' of the outer FunctionGraph, that'd require an interface similar to Scan.

Yes, it would be similar to a narrower, streamlined Scan, or an expanded Elemwise, or an OpFromGraph with looping.

kc611 commented 2 years ago

Built a rudimentary implementation: https://gist.github.com/kc611/72732f4305fe273c2da1620d6fb4e90c

We should discuss how/at which level are we going to check for input shapes, without the actual shape information as integer tuples, it'll be very hard to determine if inputs are valid or not until we hit Op.perform where we do have that information.

Any way in which https://github.com/aesara-devs/aesara/pull/711 can help us out here ?

brandonwillard commented 2 years ago

Built a rudimentary implementation: https://gist.github.com/kc611/72732f4305fe273c2da1620d6fb4e90c

Yes, that's a good start, especially the use of the HasInnerGraph interface.

To parse the NumPy signatures, we can use the same code NumPy uses (e.g. numpy.lib.function_base._parse_gufunc_signature and the like). My guess is that we could adapt most—if not all—of the basic logic used by np.vectorize, but I believe we already have (in numerous places and different ways).

Now, I think it's important that we leave the ufunc signatures out of the Op-level logic, and only use it as a convenient/familiar interface, because we'll need to focus on implementing a new Op interface to replace RandomVariable.[ndim_supp, ndims_params] and Elemwise.nfunc_spec (not the NumPy function name part, though).

This interface is really the first step. From there, the work is in Op.infer_shape and Op.grad/Op.L_op/Op.R_op.

We should discuss how/at which level are we going to check for input shapes, without the actual shape information as integer tuples, it'll be very hard to determine if inputs are valid or not until we hit Op.perform where we do have that information.

There's nothing we can—or really need—to do about that. That's the nature of this type of work: we simply do what we can with the available static information and leave the run-time-level checking to the run-time.

Any way in which #711 can help us out here ?

Yes, but now we're talking about static analysis-based optimizations. Those are great, but they come later down the line.

brandonwillard commented 2 years ago

@kc611, I just created an outline for the kind of Op that might work. You're welcome to take over that PR if you're interested in implementing this; otherwise, we can use it as a reference until either I have the time or someone else wants to take this on.

kc611 commented 2 years ago

I'll give it a try.

aesara-devs / aesara

Create an `Op` for NumPy's generalized `ufunc`s #695