keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Sparse Tensors #18420

Open adi-kmt opened 1 year ago

adi-kmt commented 1 year ago

Hey @fchollet,

do you have any roadmap to implement sparse tensors for the different backends?

fchollet commented 1 year ago

The general plan is to let backend ops receive and return backend-native sparse tensor types. For example, `backend.tensorflow.numpy.matmul` should be able to receive sparse inputs, in which case it would return sparse outputs.

Then, when passed sparse data inputs (e.g. via tf.data or via scipy sparse arrays), we would not densify them and would instead pass them straight to the backend ops.

Lastly, we would avoid densifying gradients in the optimizer (which we currently do).

Most of the work will be adding sparse tensor support to all backend ops.

We should do this for the TensorFlow backend first, as it has the most mature support for sparse tensors so far.
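A minimal sketch of the intended behavior on the TensorFlow backend, using TF's existing native sparse ops directly (this is TF's current API, not a finalized Keras one):

```python
import tensorflow as tf

# A mostly-zero matrix kept in TF's native sparse format.
a = tf.sparse.from_dense(tf.constant([[0., 2., 0.],
                                      [0., 0., 3.]]))
b = tf.constant([[1., 0.],
                 [0., 1.],
                 [1., 1.]])

# The sparse input is consumed directly, never densified.
out = tf.sparse.sparse_dense_matmul(a, b)  # [[0., 2.], [3., 3.]]
```

The idea is that Keras backend ops would accept `tf.SparseTensor` inputs like `a` above and dispatch to sparse kernels under the hood.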

jackd commented 1 year ago

I'd be willing to do a lot of the legwork to make this happen (e.g. write backend wrappers and Keras Operations) if some framework for composite tensors (or even just a SparseTensor class) could be established.

fchollet commented 1 year ago

@hertschuh is currently working on this -- but there may be items you guys can take on!

ghsanti commented 1 week ago

Hi!

Are there any updates for other-than-tf support for sparse tensors @fchollet ?

hertschuh commented 1 week ago

> Hi!
>
> Are there any updates for other-than-tf support for sparse tensors @fchollet ?

@ghsanti ,

Sparse support has been added for JAX, specifically using `jax.experimental.sparse.BCOO`. Feature-wise, it's at parity with TensorFlow for most things (gradient support is the most notable gap).
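For reference, a minimal sketch of what the underlying JAX representation looks like with `jax.experimental.sparse.BCOO` (plain JAX here, not the Keras API):

```python
import jax.numpy as jnp
from jax.experimental import sparse

# Build a BCOO sparse matrix from a mostly-zero dense array.
dense = jnp.array([[0., 2., 0.],
                   [0., 0., 3.]])
m = sparse.BCOO.fromdense(dense)

# BCOO supports matmul against dense operands.
v = jnp.array([1., 1., 1.])
out = m @ v  # [2., 3.]
```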

ghsanti commented 1 week ago

> > Hi!
> >
> > Are there any updates for other-than-tf support for sparse tensors @fchollet ?
>
> @ghsanti ,
>
> Sparse support has been added for JAX, specifically using `jax.experimental.sparse.BCOO`. Feature-wise, it's at parity with TensorFlow for most things (gradient support is the most notable gap).

I need sparse gradients (not dense ones), as densifying would otherwise defeat the purpose (memory would go up, speed would go down).

With this constraint on gradients, is TF the only possible backend?

Yet neither sparse graph convolutions nor sparse convolutions are available as TF layers.

"Graph" ones seem implementable in a layer using smth like $\sigma (AW)$ since keras supports sparse matmul through tf, right? And get the sparse gradient if I get your point correctly?

hertschuh commented 5 days ago

@ghsanti

JAX handles sparsity in gradients differently, so there's nothing we can do in Keras right now. That makes TF the only possible backend.

Sparse convolutions are indeed not available in TensorFlow. Sparse matmul does produce sparse gradients, so you could try that.
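A sketch of that $\sigma(AW)$ idea as a custom layer on the TF backend: the adjacency matrix stays a `tf.SparseTensor` and the aggregation uses TF's sparse-dense matmul, so the adjacency is never densified. The names here (`SparseGraphConv`, the `units` argument) are illustrative, not an existing Keras layer:

```python
import tensorflow as tf
import keras

class SparseGraphConv(keras.layers.Layer):
    """Illustrative graph convolution: sigma(A @ X @ W) with sparse A."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # input_shape corresponds to the node-feature matrix X.
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="glorot_uniform",
        )

    def call(self, x, adjacency):
        # adjacency is a tf.SparseTensor; it is never densified.
        h = tf.sparse.sparse_dense_matmul(adjacency, x)  # A @ X
        return tf.nn.relu(tf.matmul(h, self.w))          # sigma(.. @ W)
```

Usage would look like `SparseGraphConv(4)(node_features, tf.sparse.from_dense(adj))`; whether the end-to-end gradient sparsity is sufficient for your memory budget is worth checking empirically.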