HHTseng / video-classification

Tutorial for video classification/ action recognition using 3D CNN/ CNN+RNN on UCF101

LSTM #15

Closed dmitrysarov closed 4 years ago

dmitrysarov commented 4 years ago

https://github.com/HHTseng/video-classification/blob/82d85e8c2a5dff3eea66e4deff1d927a7144fc00/CRNN/functions.py#L345

input of shape (seq_len, batch, input_size): tensor containing the features of the input sequence. The input can also be a packed variable length sequence. See torch.nn.utils.rnn.pack_padded_sequence() or torch.nn.utils.rnn.pack_sequence() for details.

Why do you feed the LSTM an input of shape (batch, seq_len, input_size)?

dmitrysarov commented 4 years ago

Oh, I got it: the LSTM is created with batch_first=True, so it takes input of shape (batch, seq_len, input_size).
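
For anyone landing here with the same question, here is a minimal sketch of how batch_first changes the layout nn.LSTM expects. The sizes and variable names below are made up for illustration and are not taken from the repository's functions.py.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, chosen only for this illustration.
batch, seq_len, input_size, hidden_size = 4, 10, 32, 16

# Data laid out batch-first: (batch, seq_len, input_size).
x = torch.randn(batch, seq_len, input_size)

# batch_first=True: the LSTM reads (batch, seq_len, input_size) directly.
lstm_bf = nn.LSTM(input_size, hidden_size, batch_first=True)
out_bf, (h_n, c_n) = lstm_bf(x)

# Default (batch_first=False): the LSTM expects (seq_len, batch, input_size),
# so the same data has to be transposed first.
lstm_sf = nn.LSTM(input_size, hidden_size)
out_sf, _ = lstm_sf(x.transpose(0, 1))

print(out_bf.shape)  # (batch, seq_len, hidden_size)
print(out_sf.shape)  # (seq_len, batch, hidden_size)
print(h_n.shape)     # (num_layers, batch, hidden_size) in both cases
```

Note that batch_first only affects the input and output tensors; the hidden and cell states keep the (num_layers, batch, hidden_size) layout either way.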