keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

Ragged Tensors #18414

Open PatReis opened 1 year ago

PatReis commented 1 year ago

Hello,

thanks for the wonderful work on Keras Core. I saw keras-team/keras#18420 and keras-team/keras#18467 and wanted to ask about the ideas/roadmap for supporting ragged tensors. Will there be a possibility to pass a KerasRaggedTensor to layers and models (without support for ragged operations across backends, of course), so as to stay compatible with tf.keras? Just to feed a ragged-shaped tensor to a model or layer without having to resort to padding. I know that this is probably a lot of work, so I just wanted to know in order to plan ahead for myself.

Best regards

fchollet commented 1 year ago

Reposting from the other thread:

We might support RaggedTensor in the future with the TF backend specifically (such a feature does not exist with other frameworks).

However, if you're just looking to have a dynamic data dimension, then you can do that with standard (rectangular) tensors.

In general it is possible to handle any workflow using rectangular tensors. RaggedTensors are a convenience but not a blocker.
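
For illustration, a minimal sketch of the rectangular-tensor approach (not from the original comment): the variable data dimension is declared as `None` in the `Input` shape, and padding is hidden from downstream layers via masking.

```python
import numpy as np
import keras

# Variable-length sequences with purely rectangular tensors: the time
# dimension is None, and padding (token id 0) is masked out downstream.
inputs = keras.Input(shape=(None,), dtype="int32")
x = keras.layers.Embedding(input_dim=1000, output_dim=16, mask_zero=True)(inputs)
x = keras.layers.LSTM(32)(x)
outputs = keras.layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)

# Each batch only needs to be padded to its own longest sample.
batch = np.array([[3, 7, 0, 0], [5, 2, 9, 1]], dtype="int32")
model.predict(batch)
```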

PatReis commented 1 year ago

@fchollet Thank you very much, you can close this issue then.

Just one last question:

My worry is that, back then, you were not able to feed a Keras model inputs of shape, say, (32, 128) and (64, 64, 64); they all needed to have the same first (batch) dimension, which I think is related to distributed training. My question was only about the input batch dimension; I personally do not need ragged operations or the other features of ragged tensors.

However, I still have to check whether Keras Core behaves the same way.

It was very handy to use a ragged tensor as model input to get a fixed batch dimension but an otherwise flexible shape. Do you think it would be possible to have, as a minimum, an Input(ragged=True) layer that takes ragged input and can be used as the graph entry point, but then just returns the inner parts (values and nested row splits) as Keras tensors? For example, with tf.data the .ragged_batch(batch_size) method was very handy for assembling ragged tensor input.

But yes, I understand that you can use padded or bucketed data, which is fine I guess, though it adds a little overhead even if the tensors are decomposed in the first layer.
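
As a rough sketch of the "decompose in the first layer" idea (TF backend only, subclassed usage assumed; `RaggedUnpack` is an illustrative name, not an existing Keras layer):

```python
import tensorflow as tf
import keras

class RaggedUnpack(keras.layers.Layer):
    """Turns a tf.RaggedTensor into rectangular tensors at the model entry."""
    def call(self, ragged):
        # ragged: tf.RaggedTensor of shape (batch, None, features)
        values = ragged.flat_values          # (total_items, features)
        row_lengths = ragged.row_lengths()   # (batch,) -- items per sample
        return values, row_lengths

# tf.data can still build the ragged batches, e.g. via .ragged_batch(batch_size),
# and everything after this layer only ever sees rectangular tensors.
```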

swamidass commented 11 months ago

I also need ragged arrays in Keras, or equivalent functionality to feed awkwardly shaped data. This is critical not only for NLP but also for graph networks.

The key pain point is that the fit method comes with a strong assumption that all the input arrays contain data for all the input examples, partitioned along the first axis. Ragged arrays encoded this way break that assumption.

It's true that only TF technically supports ragged tensors. But Keras could and should support a limited, cross-backend composite array type that stores a ragged (batch, ragged_dim, ...) array as a row_lengths array of shape (batch,) together with a values array of shape (value_id, ...). Using this encoding, the GNN libraries (see DGL, jraph, tensorflow_gnn) have been able to reach a workable situation.
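
For concreteness, a backend-agnostic sketch of that encoding in plain NumPy (function names are illustrative, not an existing API):

```python
import numpy as np

def ragged_encode(rows):
    """Encode a list of (len_i, features) arrays as values + row_lengths."""
    values = np.concatenate(rows, axis=0)            # (total, features)
    row_lengths = np.array([len(r) for r in rows])   # (batch,)
    return values, row_lengths

def ragged_decode(values, row_lengths):
    """Recover the per-row arrays from the composite encoding."""
    splits = np.cumsum(row_lengths)[:-1]
    return np.split(values, splits, axis=0)

rows = [np.random.rand(3, 8), np.random.rand(5, 8), np.random.rand(1, 8)]
values, row_lengths = ragged_encode(rows)
decoded = ragged_decode(values, row_lengths)
assert all(np.array_equal(a, b) for a, b in zip(rows, decoded))
```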

Without this being implemented directly as a Keras-level composite array type, it would be very difficult to get this to work because of the strong shape assumptions Keras makes.

Now, perhaps there is already a way to make Keras-level composite arrays? If so, maybe there is a workaround?

PatReis commented 11 months ago

@swamidass I am currently porting my old Keras graph library (https://github.com/aimat-lab/gcnn_keras/tree/master) to Keras 3.0, and I decided to use the disjoint representation of PyTorch Geometric as the main graph representation; I think jraph and DGL use it too. This is possible because there is no restriction on the tensors passed between layers. It already seems to work to pass ragged tensors to Keras 3.0 models, although a ragged kwarg in the Input layer would be great for backward compatibility. So I ended up simply decomposing them in the first layer and continuing with normal disjoint tensors (tested with TensorFlow only so far).

For JAX, however, you would have to use a loader, either the PyTorch data loader or tf.data, to load disjoint tensors with padded, fixed sizes into the model. I have not tested this yet; for JAX, bucketing and padding is the only way, I think. But I believe that padded disjoint tensors, with a dummy graph at the padded index, would not cause a large performance reduction.
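
For readers unfamiliar with it, a small NumPy sketch of the disjoint batching described above (all graphs concatenated into one big graph, with a batch vector recording which graph each node belongs to; names are illustrative):

```python
import numpy as np

def disjoint_batch(node_feats, edge_indices):
    """node_feats: list of (n_i, F) arrays; edge_indices: list of (2, e_i) arrays."""
    offsets = np.cumsum([0] + [x.shape[0] for x in node_feats[:-1]])
    nodes = np.concatenate(node_feats, axis=0)        # (sum n_i, F)
    edges = np.concatenate(
        [idx + off for idx, off in zip(edge_indices, offsets)], axis=1
    )                                                 # (2, sum e_i), indices shifted per graph
    batch = np.concatenate(
        [np.full(x.shape[0], i) for i, x in enumerate(node_feats)]
    )                                                 # (sum n_i,), graph id of each node
    return nodes, edges, batch
```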

swamidass commented 11 months ago

For JAX, it is easiest because there are no constraints. For performance reasons you do want to pad to consistent sizes, and jraph has a simple function to accomplish this. Nonetheless, the input tensors still end up with different leading dimensions regardless.
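
The jraph padding utility referred to here is presumably jraph.pad_with_graphs; a rough sketch of how it is typically used, with toy values:

```python
import jax.numpy as jnp
import jraph

# A toy single-graph GraphsTuple: 3 nodes with 4 features, 2 edges with 1 feature.
graph = jraph.GraphsTuple(
    nodes=jnp.ones((3, 4)),
    edges=jnp.ones((2, 1)),
    senders=jnp.array([0, 1]),
    receivers=jnp.array([1, 2]),
    globals=None,
    n_node=jnp.array([3]),
    n_edge=jnp.array([2]),
)

# Pad to fixed totals (8 nodes, 16 edges, 2 graphs) so that jit/XLA sees
# identical shapes across batches; the extra graph absorbs the padding.
padded = jraph.pad_with_graphs(graph, n_node=8, n_edge=16, n_graph=2)
```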

@PatReis, how are you managing loading batches into keras fit? Or are you just avoiding keras fit and writing your own training function?

PatReis commented 10 months ago

@swamidass No, I am still working on the port to Keras 3 and everything is experimental at the moment. But here is an example of how you could realize loading with keras fit and the future kgcnn package:

https://github.com/aimat-lab/gcnn_keras/blob/master/docs/source/models.ipynb or https://github.com/aimat-lab/gcnn_keras/blob/master/notebooks/tutorial_model_loading_options.ipynb

You cannot really use ragged tensors because of ops.convert_to_tensor(), but you can disassemble ragged tensors in the first layer. That, I think, works.

shkarupa-alex commented 6 months ago

+1 to restore ragged tensors support

swamidass commented 3 months ago

What is the current status of keras and ragged tensors?

CtrlShanya commented 2 months ago

I would also like to know, especially for ragged tensor support in Conv layers!