HIPS / autograd

Efficiently computes derivatives of NumPy code.
MIT License

Sparse matrix support #80

Open wcbeard opened 8 years ago

wcbeard commented 8 years ago

Is there any way to use autodiff on sparse scipy matrices? The docs say np.dot isn't supported (and attempting it throws a ValueError: setting an array element with a sequence.). I can get the dot product with h.dot(W1), but autograd doesn't work that way according to #8.

Is there any known workaround for dealing with sparse matrices, or anything planned?

Thanks for the great library!

mattjj commented 8 years ago

For proper support, we'd probably have to wrap the functions in scipy.sparse, analogous to the way we wrap scipy.linalg (and everything else in numpy; check out autograd/scipy/ for examples). We may also need to introduce a new Node type to wrap sparse matrix instances. I don't think sparse support is on any of our task lists right now, so this issue might remain open for a while.
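
For concreteness, a rough sketch of what that wrapping could look like for a single operation (a sparse matrix-vector product, differentiating only with respect to the dense vector), assuming autograd's primitive/defvjp extension API; the function name is just illustrative:

import scipy.sparse as sp
from autograd.extend import primitive, defvjp

@primitive
def sparse_matvec(A, x):
    # A: scipy.sparse matrix treated as a constant, x: dense 1-D array
    return A @ x

# For y = A x, the VJP w.r.t. x maps g to A^T g. No VJP is defined for A itself;
# that is where a new Box/Node type for sparse matrix values would come in.
defvjp(sparse_matvec, None, lambda ans, A, x: lambda g: A.T @ g)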

namednil commented 7 years ago

Has anybody started working on this? Sparse matrix support would make this really cool project even cooler, as it would make it suitable for a variety of new tasks, especially in natural language processing.

ericmjl commented 6 years ago

I'm working on sparse arrays in my CuPy branch at the moment. If you'd like to come contribute, please ping me.

ericmjl commented 6 years ago

Just hit a roadblock with sparse arrays in CuPy. Switching over to SciPy sparse.

twhughes commented 6 years ago

@ericmjl Any luck with scipy.sparse since your last comment :D?

ericmjl commented 6 years ago

@twhughes alas, I lack a strong linear algebra background. Do you know how to define the derivative of a dot product? That is mostly where I got stuck... before I switched to PyTorch, which let me do what I needed for work.
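
For reference, the 2-D case: for C = A @ B with upstream gradient G = dL/dC, the VJPs are dL/dA = G @ B^T and dL/dB = A^T @ G. Autograd already defines these for dense np.dot; a sparse version needs the same formulas computed with sparse-aware operations. A minimal dense sketch, assuming autograd's primitive/defvjp API (the name matmat is just illustrative):

import autograd.numpy as anp
from autograd.extend import primitive, defvjp

@primitive
def matmat(A, B):
    return anp.dot(A, B)

# for C = A @ B with upstream gradient G = dL/dC:
#   dL/dA = G @ B^T   and   dL/dB = A^T @ G
defvjp(matmat,
       lambda ans, A, B: lambda G: anp.dot(G, B.T),
       lambda ans, A, B: lambda G: anp.dot(A.T, G))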

twhughes commented 6 years ago

I believe so, but I get TypeError: Can't differentiate w.r.t. type <class 'scipy.sparse.csr.csr_matrix'> when I try to implement it, and this seems like it could be a deeper issue, unfortunately.

I'm not sure how to change autograd to support sparse matrices.

ericmjl commented 6 years ago

@twhughes check my fork: https://github.com/ericmjl/autograd/tree/scipy-sparse/autograd/scipy/sparse

The thing that you might need to do here is to create a box for sparse matrices, I think.

If you're up for it, I have most of the tooling (I think) available for scipy sparse, except for the derivative. Maybe you could do a PR into my fork?
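
A very rough sketch of that "box for sparse matrices" idea, assuming autograd's extend API (Box and its register classmethod); a complete version would also need a VSpace for sparse values and wrapped primitives behind the box's operators:

import scipy.sparse as sp
from autograd.extend import Box

class SparseMatrixBox(Box):
    # operations that should trace through autograd (e.g. dot, __matmul__, .T)
    # would be forwarded here to wrapped primitives
    pass

# register the box for the concrete scipy sparse types to differentiate through
for sparse_type in (sp.csr_matrix, sp.csc_matrix, sp.coo_matrix):
    SparseMatrixBox.register(sparse_type)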

twhughes commented 6 years ago

@ericmjl That sounds perfect! Thanks! I'll take a look

twhughes commented 6 years ago

@ericmjl I implemented a gradient for spsolve(). You can see it here: https://github.com/twhughes/autograd/blob/scipy-sparse/spsolve_test.py (just run python spsolve_test.py).
However, I'm having issues when casting or creating sparse matrices from within a function. I would like to make a function that takes in a numpy array, adds it to the diagonal of a sparse matrix, and spsolve()s the result, but I get an error when I try to construct a csc_matrix inside the function:

TypeError: no supported conversion for types: (dtype('O'),)

I can look later into incorporating some gradients for dot, spsolve, and other functions into the branch on your fork.
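
Not the linked implementation, just a minimal sketch of a spsolve gradient with respect to the right-hand side b (with the sparse matrix A held fixed), assuming autograd's primitive/defvjp API:

import scipy.sparse.linalg as spla
from autograd.extend import primitive, defvjp

@primitive
def sparse_solve(A, b):
    return spla.spsolve(A, b)

# For x = A^{-1} b, pulling back g requires another sparse solve with the
# transposed matrix: g -> A^{-T} g. (The VJP w.r.t. A's entries is left out here.)
defvjp(sparse_solve, None, lambda ans, A, b: lambda g: spla.spsolve(A.T, g))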

ericmjl commented 5 years ago

@twhughes I finally found some time at work to do some digging around. Apparently, this is because an array of ArrayBoxes has dtype 'O'. This array of ArrayBoxes gets created somewhere in the gradient computation, I think, but I'm not 100% sure of that. I have raised an issue here: https://github.com/HIPS/autograd/issues/459

I think it's fixable, somehow, but I'm not quite sure whether the fix has to happen in SciPy or in autograd.

ericmjl commented 5 years ago

@twhughes I have found another sparse array library, aptly called "sparse", developed with the goal of being a NumPy-API-compatible sparse library that goes beyond the usual 2 dimensions. I have been digging around trying to make it work with autograd, and I have a feeling it's almost there. Do you have a good mental model of what's going on in the internals of autograd? Would you be open to a call sometime? Happy to share some code.

maedoc commented 1 year ago

Does anyone else just need the sparse matrix-vector multiply? For a generic linear operator with @ syntax,

from autograd.extend import primitive, defvjp

@primitive
def generic_mm(A, x):
    return A @ x

# VJP w.r.t. x only (the matrix is treated as a constant): for y = A x, the pullback of g is A^T g, i.e. g @ A
defvjp(generic_mm, None, lambda ans, spm, x: lambda g: g @ spm)

seems to work, e.g.

import numpy as onp
import scipy.sparse
import autograd.numpy as np
from autograd import grad

# dense matrix with ~90% of its entries zeroed out, plus a sparse copy
A = np.random.rand(50, 50)
A[A < 0.9] = 0.0
sA = scipy.sparse.csr_matrix(A)
x = np.random.randn(50)

# gradient through the dense np.dot vs. through the sparse primitive
gloss1 = grad(lambda x: np.sum(np.dot(A, x)))
gloss2 = grad(lambda x: np.sum(generic_mm(sA, x)))

onp.testing.assert_allclose(gloss1(x), gloss2(x))

tylerflex commented 1 year ago

@maedoc this is great, but defines the VJP with respect to the x in Ax=b.

It's been a while, but I think what we were after originally was rather a way to define the vjp w.r.t. the elements of A. For example, to be able to grad something like:

lambda x, y: np.dot(A(y), x)

where the sparse matrix A is a function of y.

To this end, I got some VJP primitives working in my electromagnetic simulation library "ceviche", where I define VJPs involving a sparse matrix A that is constructed from its "entries" (raw values) and "indices" (integer indices of the non-zero elements). Note that the VJP w.r.t. the indices is not defined, as it makes little sense to do so.

In the primitives.py file, I defined various VJPs for operations like np.dot(A, x) and spsolve(A, b) w.r.t. both x and the entries of A, in case this is useful for anyone reading this!
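
To illustrate the idea (this is not ceviche's actual code, just a sketch with made-up names, using a COO parameterization and autograd's primitive/defvjp API): parameterize the matrix by its entries and row/column indices, and define VJPs of the sparse dot product w.r.t. the entries as well as w.r.t. x.

import scipy.sparse as sp
from autograd.extend import primitive, defvjp

@primitive
def sp_dot(entries, x, rows, cols, shape):
    # A @ x, where A = coo_matrix((entries, (rows, cols)), shape)
    A = sp.coo_matrix((entries, (rows, cols)), shape=shape)
    return A @ x

# Entry k sits at (rows[k], cols[k]) and contributes entries[k] * x[cols[k]]
# to output rows[k], so its cotangent is g[rows[k]] * x[cols[k]].
# The VJP w.r.t. x is the usual A^T g, rebuilt from the same entries/indices.
defvjp(sp_dot,
       lambda ans, entries, x, rows, cols, shape: lambda g: g[rows] * x[cols],
       lambda ans, entries, x, rows, cols, shape: lambda g:
           sp.coo_matrix((entries, (rows, cols)), shape=shape).T @ g)

With this, something like lambda x, y: sp_dot(entries_fn(y), x, rows, cols, shape) can be differentiated w.r.t. both x and y (entries_fn being whatever hypothetical function maps y to the nonzero values), since the gradient flows into the entries.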