How does ConvLSTM2D learn spatio-temporal information from individual images?

rezazad68 / BCDU-Net

BCDU-Net : Medical Image Segmentation


How does ConvLSTM2D learn spatio-temporal information from individual images? #40

Closed: kiranshahi closed this issue 2 years ago

kiranshahi commented 2 years ago

As you have mentioned, ConvLSTM2D is used for spatio-temporal information.

But I am confused: is this model really going to learn temporal information? I have two reasons for doubting it.

1. Your datasets consist of images, not videos.
2. You use convolutional layers in the encoder and decoder, but a bidirectional ConvLSTM in the skip connections.

Can you share your findings? Does this really work?

Reference: Temporal Information (Stack Overflow answer)

rezazad68 commented 2 years ago

Thanks for your interest in our work. The objective of the ConvLSTM here is not to learn spatio-temporal information. It acts as a mechanism for learning a non-linear representation shared between the encoding and decoding paths. More specifically, we consider that the feature set coming from the encoder path contains more localization information, while the decoder path brings semantic information; however, both tensors see the same object. To map these two tensors effectively into a combined representation, we use a ConvLSTM.
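
To make the idea in the reply above concrete, here is a minimal NumPy sketch of a ConvLSTM used as a fusion operator rather than as a video model. It is not the repository's code: the helper names (`conv2d_same`, `conv_lstm_step`, `fuse`, `init_params`), the tensor shapes, and the random weights are all assumptions made for illustration. The encoder and decoder feature maps are stacked into a length-2 "sequence", so the "time" axis only indexes the two sources, and the final hidden state is taken as the combined feature map.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def conv2d_same(x, w):
    """'Same'-padded 2D cross-correlation (odd kernel size).

    x: (H, W, Cin), w: (k, k, Cin, Cout) -> (H, W, Cout)
    """
    k = w.shape[0]
    p = k // 2
    H, W, _ = x.shape
    xp = np.pad(x, ((p, p), (p, p), (0, 0)))
    out = np.zeros((H, W, w.shape[3]))
    for i in range(k):
        for j in range(k):
            # Each kernel offset contributes a shifted view times a channel-mixing matrix.
            out += xp[i:i + H, j:j + W, :] @ w[i, j]
    return out


def conv_lstm_step(x, h, c, wx, wh, b):
    """One ConvLSTM step: LSTM gates computed with convolutions."""
    gates = conv2d_same(x, wx) + conv2d_same(h, wh) + b  # (H, W, 4F)
    i, f, g, o = np.split(gates, 4, axis=-1)
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(c_new)
    return h_new, c_new


def init_params(rng, in_ch, filters, k=3):
    # Hypothetical random weights; a trained model would learn these.
    return {
        "wx": rng.normal(0.0, 0.1, (k, k, in_ch, 4 * filters)),
        "wh": rng.normal(0.0, 0.1, (k, k, filters, 4 * filters)),
        "b": np.zeros(4 * filters),
    }


def fuse(encoder_feat, decoder_feat, params, reverse=False):
    """Treat the two feature maps as a 2-step sequence and return the last hidden state."""
    seq = np.stack([encoder_feat, decoder_feat], axis=0)  # (2, H, W, C)
    if reverse:
        seq = seq[::-1]
    H, W, _ = encoder_feat.shape
    filters = params["b"].shape[0] // 4
    h = np.zeros((H, W, filters))
    c = np.zeros((H, W, filters))
    for x in seq:
        h, c = conv_lstm_step(x, h, c, **params)
    return h


rng = np.random.default_rng(0)
encoder_feat = rng.normal(size=(8, 8, 6))  # stands in for the skip-connection features
decoder_feat = rng.normal(size=(8, 8, 6))  # stands in for the upsampled decoder features
params = init_params(rng, in_ch=6, filters=4)

fused = fuse(encoder_feat, decoder_feat, params)
fused_reversed = fuse(encoder_feat, decoder_feat, params, reverse=True)
print(fused.shape)  # (8, 8, 4)
```

In this sketch the order of the two inputs changes the result, which is one reason a bidirectional variant (processing both orders and combining the two hidden states) can be attractive. Compared with plain concatenation followed by a convolution, the gates let the layer decide per pixel and per channel how much of each source to keep.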