Epiphqny / VisTR

[CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers
https://arxiv.org/abs/2011.14503
Apache License 2.0
738 stars, 96 forks

Too many iterations in ONE EPOCH? #43

Closed: JialianW closed this issue 3 years ago

JialianW commented 3 years ago

Dear authors,

Thanks for your great open-source work. I have a question regarding the training:

In each epoch, the number of iterations equals the number of images. However, the input to each iteration is a whole video containing 36 images. That is, on average, each image is trained on 36 times within a single epoch. I think the more common approach is to set the number of iterations equal to the number of videos, so that each image is seen only once per epoch. Why are there so many iterations in one epoch? Is this required by the method, or is it just for convenience of implementation?

Thanks a lot! Looking forward to hearing from you.

Epiphqny commented 3 years ago

@JialianW You are right, that is just for convenience of implementation.
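
The arithmetic behind the question can be sketched in a few lines of Python. This is only an illustration and is not taken from the VisTR code: the dataset (`video_lengths`) and the helper `epoch_stats` are hypothetical, and every video is assumed to be exactly 36 frames long so that one clip covers one whole video.

```python
# Illustrative sketch only; names and numbers are hypothetical, not from VisTR.
CLIP_LEN = 36                      # frames fed to the model per iteration (from the question)
video_lengths = [36, 36, 36, 36]   # hypothetical dataset: 4 videos, 36 frames each


def epoch_stats(video_lengths, clip_len, index_by):
    """Return (iterations per epoch, average visits per frame per epoch)."""
    total_frames = sum(video_lengths)
    if index_by == "image":
        # One iteration per image in the dataset, as described in the question.
        iterations = total_frames
    elif index_by == "video":
        # One iteration per video, the alternative suggested in the question.
        iterations = len(video_lengths)
    else:
        raise ValueError(f"unknown indexing scheme: {index_by}")
    # Every iteration processes a full clip of clip_len frames.
    frames_processed = iterations * clip_len
    return iterations, frames_processed / total_frames


iters_image, visits_image = epoch_stats(video_lengths, CLIP_LEN, "image")
iters_video, visits_video = epoch_stats(video_lengths, CLIP_LEN, "video")

print(f"image-indexed: {iters_image} iterations, {visits_image:.1f} visits per frame")
print(f"video-indexed: {iters_video} iterations, {visits_video:.1f} visits per frame")
```

Under these assumptions, one image-indexed epoch does about as much work as 36 video-indexed epochs, which is worth keeping in mind when comparing epoch counts or learning-rate schedules across codebases.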

JialianW commented 3 years ago

May I know the performance after the first epoch and after the first few epochs? It would be very helpful for us as a reference. Thanks.

Epiphqny commented 3 years ago

@JialianW It would be around 30.0 mAP.