Closed by SecretMG 4 months ago
On our V100 servers, 5000 steps take about 13 hours. To speed things up, please consider using the map cache as described here, and use as many CPU workers as you can.
So may I ask roughly how long the baseline model shown in the picture took to train? I also wonder whether the baseline model was trained with 61 frames (with sweeps and generated annotations) or with just 16 frames.
It takes about 10 days to train with 8 V100 GPUs. If you have more GPUs or more GPU memory, you can shorten the training time by adjusting the batch size and learning rate.
The baseline model was trained only for 16-frame generation. We trained another model for 61-frame generation with the new configuration.
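For a rough sense of scale, the figures quoted in this thread can be turned into a back-of-the-envelope estimate. The sketch below is only an illustration: it assumes throughput stays constant over the whole run, takes the approximately 80,000 steps mentioned in the question as the total, and uses a hypothetical helper name (`estimated_training_days`) that is not part of the repository.

```python
def estimated_training_days(hours_per_chunk: float,
                            steps_per_chunk: int,
                            total_steps: int) -> float:
    """Extrapolate total wall-clock days from a measured chunk of steps.

    Assumes constant throughput, which ignores warm-up, evaluation,
    checkpointing, and any caching effects.
    """
    hours_per_step = hours_per_chunk / steps_per_chunk
    return hours_per_step * total_steps / 24.0


# Total step count taken from the question (approximate).
TOTAL_STEPS = 80_000

# 13 h per 5000 steps: the V100 figure quoted in the reply.
reference_days = estimated_training_days(13, 5000, TOTAL_STEPS)

# 30 h per 5000 steps: the figure reported in the question.
reported_days = estimated_training_days(30, 5000, TOTAL_STEPS)

print(f"reference setup: {reference_days:.1f} days")
print(f"reported setup:  {reported_days:.1f} days")
```

Under these assumptions the reference rate works out to a little under 9 days, which is in line with the "about 10 days" quoted above, while the reported rate extrapolates to about 20 days.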
Hello,
Thank you for your excellent work! I would like to ask about the approximate training time for the video generation model. I followed the instructions from this link and used the command scripts/dist_train.sh 8 runner=8gpus_t +exp=rawbox_mv2.0t_0.4.3 for training. However, it took me over 30 hours to train for 5000 steps. I would like to know if this is normal because it was mentioned that approximately 80,000 steps are needed for training, which would take a considerable amount of time.
Thank you very much for your help!