MCG-NJU / MixFormer

[CVPR 2022 Oral & TPAMI 2024] MixFormer: End-to-End Tracking with Iterative Mixed Attention
https://arxiv.org/abs/2203.11082
MIT License

Training requirement #73

Closed CarlHuangNuc closed 1 year ago

CarlHuangNuc commented 1 year ago

Hi,

Could you share the GPU setup and the training time? I am trying to reproduce your paper, but training seems to depend heavily on a high-end platform (A100-class or better).

yutaocui commented 1 year ago

For MixViT-B (ViT-B or ConvMAE-B), we train on 8 RTX 2080 Ti or Tesla V100 GPUs, which takes about 50+ hours. For MixViT-L (ViT-L or ConvMAE-L), we use 8 Tesla V100 or RTX 8000 GPUs and training takes around 4+ days. In fact, about 300 epochs instead of the 500 reported in our paper may be enough, which would save training time.
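
For rough planning on other hardware, a back-of-envelope estimate like the sketch below may help. The reference numbers come from the reply above; the linear scaling with epoch count and inverse scaling with GPU count are assumptions (real throughput varies with GPU model, batch size, and data loading), so treat the output as a ballpark, not a guarantee.

```python
# Rough training-time estimate based on the figures quoted above.
# ASSUMPTION: wall-clock time scales linearly with epochs and
# inversely with GPU count -- an approximation only.

def estimate_hours(reference_hours: float, reference_epochs: int,
                   target_epochs: int, reference_gpus: int = 8,
                   target_gpus: int = 8) -> float:
    """Scale a reported training time to a different epoch/GPU budget."""
    epoch_scale = target_epochs / reference_epochs
    gpu_scale = reference_gpus / target_gpus
    return reference_hours * epoch_scale * gpu_scale

# MixViT-B: ~50 h for 500 epochs on 8 GPUs (from the reply above).
print(f"MixViT-B, 300 epochs, 8 GPUs: ~{estimate_hours(50, 500, 300):.0f} h")
# MixViT-L: ~4 days (~96 h) for 500 epochs on 8 GPUs.
print(f"MixViT-L, 300 epochs, 4 GPUs: ~{estimate_hours(96, 500, 300, 8, 4):.0f} h")
```

With these assumptions, cutting from 500 to 300 epochs brings MixViT-B down to roughly 30 hours on the same 8-GPU setup.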

CarlHuangNuc commented 1 year ago

Thanks,