Closed: charlesCXK closed this issue 2 years ago
Thanks for your suggestion. I will answer your question in a few days.
Hardware setup: 8 × A6000 GPUs, with each GPU holding 128 images.
MAE: 0.4084 s per iteration; GPU memory reported by nvidia-smi: 17926 MB.
ConvMAE: 0.8306 s per iteration; GPU memory reported by nvidia-smi: 27049 MB.
ConvMAE with masked-region computation skipped in stage 1 and stage 2: 0.5480 s per iteration; GPU memory reported by nvidia-smi: 21250 MB.
Thank you so much for reminding us about the training speed comparison. We will include a speed / GPU memory / FLOPs comparison in the updated version.
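For reference, here is a minimal sketch of how such per-iteration timings and peak-memory numbers can be collected in PyTorch. This is not the authors' benchmark script: `model`, `loader`, and `optimizer` are placeholders for whatever pretraining setup is used, and nvidia-smi will generally report more memory than the allocator statistic below because it also counts the CUDA context and cached blocks.

```python
# Sketch only: time training iterations and read peak allocated GPU memory.
import time
import torch

def benchmark(model, loader, optimizer, device="cuda", warmup=5, iters=20):
    """Average seconds per iteration and peak allocated GPU memory."""
    model.train()
    torch.cuda.reset_peak_memory_stats(device)
    times = []
    data_iter = iter(loader)
    for step in range(warmup + iters):
        images, _ = next(data_iter)
        images = images.to(device, non_blocking=True)
        torch.cuda.synchronize(device)   # wait for pending kernels before starting the timer
        start = time.time()
        loss = model(images)             # assumes the model returns its pretraining loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        torch.cuda.synchronize(device)   # include the full backward/step in the measurement
        if step >= warmup:               # discard warm-up iterations
            times.append(time.time() - start)
    peak_mb = torch.cuda.max_memory_allocated(device) / 1024 ** 2
    print(f"{sum(times) / len(times):.4f} s/iter, peak allocated memory {peak_mb:.0f} MB")
```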
Thanks for your detailed reply!
By default, ConvMAE refers to ConvMAE with the multi-scale decoder proposed in our paper.
We are going to release Fast ConvMAE, which can significantly accelerate the pretraining of ConvMAE, in a few days: https://github.com/Alpha-VL/FastConvMAE
Fast ConvMAE has been released; it cuts the pretraining time in half.
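To illustrate why skipping masked regions saves time, here is a conceptual sketch in plain PyTorch, not the ConvMAE or FastConvMAE code: as in MAE, if only the visible patches are gathered before the transformer blocks, encoder compute scales with the kept ratio rather than the full token count. The `gather_visible` helper below is hypothetical; ConvMAE's convolutional stages 1 and 2 would need a different mechanism (e.g. masking features) since convolutions operate on a dense grid.

```python
# Conceptual sketch: keep only visible patch tokens so the encoder skips masked ones.
import torch

def gather_visible(tokens: torch.Tensor, mask_ratio: float = 0.75):
    """tokens: (B, N, C) patch embeddings; returns only the randomly kept tokens."""
    B, N, C = tokens.shape
    num_keep = int(N * (1 - mask_ratio))
    noise = torch.rand(B, N, device=tokens.device)      # random score per patch
    ids_shuffle = torch.argsort(noise, dim=1)           # ascending: lowest scores are kept
    ids_keep = ids_shuffle[:, :num_keep]
    visible = torch.gather(tokens, 1, ids_keep.unsqueeze(-1).expand(-1, -1, C))
    return visible, ids_keep                            # the encoder only sees `visible`
```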
Dear authors: Thank you for sharing this excellent work! May I ask how the time overhead of ConvMAE pre-training compares to MAE's? Could you provide the time required to train one epoch for these two methods on the same type of GPU?