Vivit Training Problem - Githubissues

First of all thank you for your excellent work! Let me talk about my configuration first. Set the model training hyperparameters according to the Training you gave. There are two main changes: changing the data set and using your Kinetics pre-training model. I am using the VGGSound dataset, which also splits the video into a sequence of RGB image frames as the dataset. The problem occurs in the model training phase. When using the pre-training model to initialize and train 1 epoch, the accuracy reaches 0.2, but the accuracy decreases as the training progresses. 2022-07-04 18:43:18 - Evaluating mean top1_acc:0.213, top5_acc:0.427 of current training epoch
2022-07-04 18:48:55 - Evaluating mean top1_acc:0.171, top5_acc:0.360 of current validation epoch
2022-07-04 21:08:07 - Evaluating mean top1_acc:0.197, top5_acc:0.430 of current training epoch
2022-07-04 21:12:59 - Evaluating mean top1_acc:0.071, top5_acc:0.202 of current validation epoch
2022-07-04 23:30:01 - Evaluating mean top1_acc:0.059, top5_acc:0.175 of current training epoch
2022-07-04 23:34:57 - Evaluating mean top1_acc:0.027, top5_acc:0.089 of current validation epoch
2022-07-05 01:46:54 - Evaluating mean top1_acc:0.029, top5_acc:0.102 of current training epoch
2022-07-05 01:51:35 - Evaluating mean top1_acc:0.017, top5_acc:0.060 of current validation epoch
2022-07-05 03:42:59 - Evaluating mean top1_acc:0.026, top5_acc:0.092 of current training epoch
2022-07-05 03:47:38 - Evaluating mean top1_acc:0.016, top5_acc:0.056 of current validation epoch
2022-07-05 05:42:18 - Evaluating mean top1_acc:0.027, top5_acc:0.096 of current training epoch
2022-07-05 05:46:48 - Evaluating mean top1_acc:0.013, top5_acc:0.054 of current validation epoch
2022-07-05 07:35:56 - Evaluating mean top1_acc:0.028, top5_acc:0.096 of current training epoch
2022-07-05 07:40:33 - Evaluating mean top1_acc:0.017, top5_acc:0.063 of current validation epoch
2022-07-05 09:32:25 - Evaluating mean top1_acc:0.028, top5_acc:0.099 of current training epoch
2022-07-05 09:37:00 - Evaluating mean top1_acc:0.017, top5_acc:0.066 of current validation epoch
2022-07-05 11:28:31 - Evaluating mean top1_acc:0.029, top5_acc:0.101 of current training epoch
2022-07-05 11:33:02 - Evaluating mean top1_acc:0.017, top5_acc:0.062 of current validation epoch

mx-mark / VideoTransformer-pytorch

Vivit Training Problem #20