aminullah6264 / Pytorch-Action-Recognition

Action Recognition in Video Sequences using Deep Bi-directional LSTM with CNN Features
https://sites.google.com/view/aminullah/home
43 stars, 17 forks

Why is "n_classes=6" set in your LSTM training code? #1

Closed Benzzzxxx closed 6 years ago

Benzzzxxx commented 6 years ago

Do you use n_classes=6 to train your model?

aminullah6264 commented 6 years ago

The code is used with several datasets, and each dataset has a different number of classes, so you can change n_classes accordingly.
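One way to avoid hard-coding the value is to derive it from the labels of whichever dataset is loaded. This is only a minimal sketch: the helper `build_label_index` and the example label names are hypothetical and are not taken from the repository.

```python
def build_label_index(labels):
    """Map each distinct class name to an integer index (sorted for stability)."""
    classes = sorted(set(labels))
    return {name: index for index, name in enumerate(classes)}


# Hypothetical per-video labels; in practice these would come from the dataset.
labels = ["walking", "boxing", "walking", "running"]

label_to_index = build_label_index(labels)
n_classes = len(label_to_index)  # number of output units for the classifier
targets = [label_to_index[name] for name in labels]
```

With this approach, switching datasets changes `n_classes` automatically instead of requiring an edit to the training script.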

Benzzzxxx commented 6 years ago

OK, I see. Thank you for your answer. I am using Resnet-Inception-V2 to extract features and then training a BLSTM on them. On my dataset I still haven't achieved high accuracy, even after modifying many parameters. For every video, the extracted features have dimension [40, 1536]. My dataset is similar to UCF101, and most videos are 250 seconds long. In your opinion, what values should I set for parameters like rnn_size? And which parameter in your code do you consider the most important?

aminullah6264 commented 6 years ago

There are a few parameters. The chunk size is the size of the frame-level features: for example, if you extract 1500 features from one frame, your chunk size is 1500. That means each of your training samples has 40 chunks. In my case, the number of chunks is 6 because I wanted to learn sequential information across 6 frames. Since your sequences are much longer, at 40 frames, increase the RNN size to 512 or more. It also depends on the feature representation of the data: if the representation is good, the model will learn the sequences. Hope it helps.
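To make the relationship between these parameters concrete, here is a rough sketch of how they could be read off a [40, 1536] feature array. The names `SequenceConfig`, `config_from_features`, and `bilstm_weight_count` are hypothetical, `n_classes=101` is only a placeholder, and the weight count assumes a single-layer bidirectional LSTM with one bias vector per gate (some libraries store two).

```python
from dataclasses import dataclass


@dataclass
class SequenceConfig:
    n_chunks: int    # time steps per sample (frames per sequence)
    chunk_size: int  # features per frame
    rnn_size: int    # hidden units per LSTM direction
    n_classes: int   # number of action classes in the dataset


def config_from_features(feature_shape, n_classes, rnn_size=512):
    """Read the sequence parameters off a (frames, features) shape."""
    n_chunks, chunk_size = feature_shape
    return SequenceConfig(n_chunks, chunk_size, rnn_size, n_classes)


def bilstm_weight_count(cfg):
    """Trainable weights in one bidirectional LSTM layer (assumed layout)."""
    # Four gates, each with input weights, recurrent weights, and one bias.
    per_direction = 4 * (cfg.rnn_size * (cfg.chunk_size + cfg.rnn_size) + cfg.rnn_size)
    return 2 * per_direction


# Placeholder class count; replace with the real number for your dataset.
cfg = config_from_features((40, 1536), n_classes=101)
weights = bilstm_weight_count(cfg)
```

Under these assumptions, raising `rnn_size` grows the recurrent layer roughly quadratically, so it may be worth increasing it in steps while watching validation accuracy.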