mx-mark / VideoTransformer-pytorch

PyTorch implementation of a collections of scalable Video Transformer Benchmarks.
272 stars 34 forks source link

Vivit Training Problem #20

Closed muzhaohui closed 2 years ago

muzhaohui commented 2 years ago

First of all thank you for your excellent work! Let me talk about my configuration first. Set the model training hyperparameters according to the Training you gave. There are two main changes: changing the data set and using your Kinetics pre-training model. I am using the VGGSound dataset, which also splits the video into a sequence of RGB image frames as the dataset. The problem occurs in the model training phase. When using the pre-training model to initialize and train 1 epoch, the accuracy reaches 0.2, but the accuracy decreases as the training progresses. 2022-07-04 18:43:18 - Evaluating mean top1_acc:0.213, top5_acc:0.427 of current training epoch
2022-07-04 18:48:55 - Evaluating mean top1_acc:0.171, top5_acc:0.360 of current validation epoch
2022-07-04 21:08:07 - Evaluating mean top1_acc:0.197, top5_acc:0.430 of current training epoch
2022-07-04 21:12:59 - Evaluating mean top1_acc:0.071, top5_acc:0.202 of current validation epoch
2022-07-04 23:30:01 - Evaluating mean top1_acc:0.059, top5_acc:0.175 of current training epoch
2022-07-04 23:34:57 - Evaluating mean top1_acc:0.027, top5_acc:0.089 of current validation epoch
2022-07-05 01:46:54 - Evaluating mean top1_acc:0.029, top5_acc:0.102 of current training epoch
2022-07-05 01:51:35 - Evaluating mean top1_acc:0.017, top5_acc:0.060 of current validation epoch
2022-07-05 03:42:59 - Evaluating mean top1_acc:0.026, top5_acc:0.092 of current training epoch
2022-07-05 03:47:38 - Evaluating mean top1_acc:0.016, top5_acc:0.056 of current validation epoch
2022-07-05 05:42:18 - Evaluating mean top1_acc:0.027, top5_acc:0.096 of current training epoch
2022-07-05 05:46:48 - Evaluating mean top1_acc:0.013, top5_acc:0.054 of current validation epoch
2022-07-05 07:35:56 - Evaluating mean top1_acc:0.028, top5_acc:0.096 of current training epoch
2022-07-05 07:40:33 - Evaluating mean top1_acc:0.017, top5_acc:0.063 of current validation epoch
2022-07-05 09:32:25 - Evaluating mean top1_acc:0.028, top5_acc:0.099 of current training epoch
2022-07-05 09:37:00 - Evaluating mean top1_acc:0.017, top5_acc:0.066 of current validation epoch
2022-07-05 11:28:31 - Evaluating mean top1_acc:0.029, top5_acc:0.101 of current training epoch
2022-07-05 11:33:02 - Evaluating mean top1_acc:0.017, top5_acc:0.062 of current validation epoch

mx-mark commented 2 years ago

@muzhaohui Sorry for the late reply. In my opinion, when you start a new project, the first step is to check if the model can overfit one sample of VGGSound. Because from your training log, it can not overfit the training set.