cure-lab / MagicDrive

[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”
https://gaoruiyuan.com/magicdrive/
GNU Affero General Public License v3.0
664 stars, 40 forks

About training time #85

Closed. mzsqw closed this issue 2 months ago

mzsqw commented 2 months ago

Hi, thank you for your impressive work!

I started training on 8 V100s with all default settings using the following command. The estimated training time for 100 epochs is one week. Is this normal?

accelerate launch --mixed_precision fp16 --gpu_ids all --num_processes 8 --num_machines 1 --main_process_port 29501 tools/train.py +exp=224x400 runner=8gpus

[image]

flymin commented 2 months ago

Duplicate. For video, see #64 and #52; for image, see #9.

mzsqw commented 2 months ago

> Duplicate. For video, see #64 and #52; for image, see #9.

Sorry, but I couldn't find the training time for image training in #9. Did you post the wrong link?

flymin commented 2 months ago

#14 shows the speed; #9 has solutions for speeding up, which are also in our README.

mzsqw commented 2 months ago

> #14 shows the speed; #9 has solutions for speeding up, which are also in our README.

Thank you for your reply. I noticed you mentioned that a speed of 2 s/it is relatively normal. Is there a problem with my speed of 5 s/it on the V100? Everything followed the default settings.

If I use the cache files in h5 format for the BEV maps, will the speed increase to 2 s/it?
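
For scale, the figures quoted in this thread can be put side by side with a little arithmetic. The sketch below is only a rough estimate. It assumes that the one-week figure for 100 epochs corresponds to the reported 5 s/it rate, and that the total number of iterations stays the same if the rate improves to 2 s/it. Neither assumption is confirmed in the thread, and the helper names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def iterations_in(days: float, seconds_per_iteration: float) -> int:
    """Number of whole iterations that fit in `days` at the given rate."""
    return int(days * SECONDS_PER_DAY / seconds_per_iteration)


def training_days(num_iterations: int, seconds_per_iteration: float) -> float:
    """Wall-clock days needed for `num_iterations` at the given rate."""
    return num_iterations * seconds_per_iteration / SECONDS_PER_DAY


# Assumption: the one-week estimate was produced at 5 s/it.
total_iterations = iterations_in(days=7, seconds_per_iteration=5.0)

# Assumption: the same number of iterations, run at 2 s/it instead.
days_at_2s = training_days(total_iterations, seconds_per_iteration=2.0)

print(f"Implied iterations for 100 epochs: {total_iterations}")
print(f"Estimated duration at 2 s/it: {days_at_2s:.1f} days")
```

Under these assumptions, the same run at 2 s/it would take a little under three days instead of a week.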

flymin commented 2 months ago

If you have any follow-ups, please use the existing issues.