The accuracy of 16 frames is worse than 8 frames

Hi author, I replaced the dataset with the smaller HMDB51 dataset, and after some experimentation I found that training with 16 frames is less accurate than training with 8 frames: 16 frames: 70.33%, 8 frames: 71.96%. We find this result hard to believe! The backbone we used is ViT-B/16.

Due to our limited experimental environment, we train ILA on a single 3090 GPU and use the following parameters: lr = 8e-6, batch size = 4.

We also tuned the learning rate and found that ILA seems to be very sensitive to it: when we increased the learning rate by a factor of 10, the accuracy dropped to only about 15%.
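
Since only two learning rates (8e-6 and ten times that) are reported here, a finer sweep between them might help locate where the accuracy collapses. Below is a minimal sketch of how such a sweep could be generated. The helper name `log_spaced_lrs` and the choice of 5 points are illustrative assumptions and are not part of the ILA code.

```python
import math


def log_spaced_lrs(low, high, num):
    """Return `num` learning rates evenly spaced on a log scale from low to high.

    Hypothetical helper for illustration; not taken from the ILA repository.
    """
    if num < 2:
        raise ValueError("num must be at least 2")
    if low <= 0 or high <= 0:
        raise ValueError("learning rates must be positive")
    log_low, log_high = math.log10(low), math.log10(high)
    step = (log_high - log_low) / (num - 1)
    return [10 ** (log_low + i * step) for i in range(num)]


if __name__ == "__main__":
    # Sweep from the learning rate used above (8e-6) up to 10x that value.
    for lr in log_spaced_lrs(8e-6, 8e-5, 5):
        print(f"lr = {lr:.2e}")
```

Each value would then be passed to a separate training run with all other settings (batch size 4, ViT-B/16, same frame count) held fixed, so that any change in accuracy can be attributed to the learning rate alone.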

Francis-Rings / ILA