keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0
61.7k stars 19.43k forks

TimeDistributed + Add Layers Error #20093

Open marta-q opened 1 month ago

marta-q commented 1 month ago

Using TensorFlow==2.17.0 and Keras==3.4.1, I am having issues when trying to use the Add layer together with the TimeDistributed one:

X = TimeDistributed(Add(), name='add_residual_convolution_' + str(it))([X, X_residual])

ValueError: `TimeDistributed` Layer should be passed an `input_shape` with at least 3 dimensions, received: [(None, 12, 0, 2), (None, 12, 0, 2)]

I have also tried passing the input_shape=X.shape argument to TimeDistributed, but the same error appears.

How can I solve this?

ghsanti commented 1 month ago

The TimeDistributed layer expects 4 dimensions (frames, width, height, channels); here there are two inputs with 3 dimensions each, which is what the error is telling you.

Do you really need the time distributed layer here? If you are just trying to add the tensors, Add is all you need.
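Since Add() is purely elementwise, applying it once to the full tensors already matches applying it timestep by timestep. A quick NumPy sketch of why (shapes are made up for illustration):

```python
import numpy as np

# Stand-ins for X and X_residual: (batch, timesteps, features).
x = np.random.rand(4, 12, 2)
x_res = np.random.rand(4, 12, 2)

# Adding the full tensors in one go...
full = x + x_res

# ...equals adding slice-by-slice along the time axis, which is all
# TimeDistributed(Add()) could do here anyway.
per_step = np.stack(
    [x[:, t] + x_res[:, t] for t in range(x.shape[1])], axis=1
)

assert np.allclose(full, per_step)
```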

marta-q commented 1 month ago

Hmm, they have 4 dimensions each: (None, 12, 0, 2). This was just an example with placeholder dimensions; the third dimension isn't really zero, and the error still happens with the actual values. I was just trying to reproduce someone else's code, which supposedly worked with Keras and TensorFlow 2.11.0.

ghsanti commented 1 month ago

None is likely the batch dimension, not a dimension of a single input sample; each input sample has to have 4 dimensions, so with the batch axis the full tensor is 5D.
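For example, with Conv2D wrapped in TimeDistributed, a batch of video clips would look like this (a sketch with made-up shapes, assuming Keras 3):

```python
import numpy as np
from keras import layers

# Hypothetical video batch: (batch, frames, height, width, channels).
# Each sample is 4D; with the batch axis the full input is 5D.
video = np.random.rand(2, 10, 32, 32, 3).astype("float32")

# TimeDistributed applies the same Conv2D to each of the 10 frames.
out = layers.TimeDistributed(layers.Conv2D(8, 3, padding="same"))(video)
print(out.shape)  # (2, 10, 32, 32, 8)
```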

marta-q commented 1 month ago

When I pass only X (same exact dimensions) with a different layer type, it works. So the issue here is not the dimensionality of the individual tensors, but that TimeDistributed does not have a proper implementation for the Add() layer. It's necessary to either raise an explicit error saying the Add() layer is not accepted by TimeDistributed, or fix TimeDistributed so that it accepts a list of tensors when the Add() layer is used.

I was just trying to see if anyone could give me a workaround for this...

ghsanti commented 1 month ago

You are right; according to the docs, TimeDistributed actually accepts 3D inputs:

Every input should be at least 3D

marta-q commented 1 month ago

So far I've changed it to this:

X = Add()([X, X_residual])
X = TimeDistributed(Dense(2), name='add_residual_convolution_' + str(it))(X)

But I don't think this achieves exactly the same thing 🤔 You said at the beginning that using only Add() could work the same, but I'm not sure I see how; could you try to explain?

ghsanti commented 1 month ago

You said in the beginning that using only Add() could work the same, but I'm not sure I get how, could you try to explain?

From what I understand (from the docs), the TimeDistributed layer is really just there to iterate an operation over an extra dimension. So in the case of Conv2D, it lets you iterate over video data (where you treat that extra input dimension as time).

In this case it seems to me that the layer is not necessary, and you just need Add()([a, b]) (but I'm just another user, and haven't really had to deal with that layer before).
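To make that concrete: layers like Dense already map over the leading axes on their own, so wrapping one in TimeDistributed on a (batch, time, features) input changes nothing, and for Add() there is no per-timestep work left to distribute. A quick check (assuming Keras 3; shapes made up):

```python
import numpy as np
from keras import layers

x = np.random.rand(4, 12, 2).astype("float32")

# One Dense instance used both directly and wrapped, so weights are shared.
dense = layers.Dense(5)
direct = dense(x)                           # Dense maps over (batch, time) itself
wrapped = layers.TimeDistributed(dense)(x)  # same computation, per timestep

assert np.allclose(direct, wrapped)
```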

Maybe @sachinprasadhs has better advice.

marta-q commented 1 month ago

I see, thanks for your help!