Open DongdongY1 opened 2 months ago
I also noticed that there is a time-embedding generation function that is never used. In the paper you mentioned "The positional information is not embedded, because the locations in a long temporal range would not be helpful as claimed in (Chen et al. 2020)." Have you run experiments to verify this? I ask because I am trying to add a time embedding to further improve the model.
Update: I finished training using the cmd above and got 76.27% mAP, which I think is roughly the same as the value in the table.
Try setting the reference frame number to 32; you may get a higher AP50.
Thanks. I am still confused about the questions above, though. Could you explain?
Hello, I'm trying to train YOLOV to reproduce the result on VID. I think your workflow would be
However, if the provided YOLOX weights were used as the baseline, I suppose they have already been trained on DET&VID. Then if I download the YOLOX weights and run the cmd
Does this mean that the model would be trained on DET&VID twice? Also, running this line results in a 7-epoch training run, which seems to be the DET training phase. Will the VID phase start automatically after it?