cure-lab / MagicDrive

[ICLR24] Official implementation of the paper “MagicDrive: Street View Generation with Diverse 3D Geometry Control”
https://gaoruiyuan.com/magicdrive/
GNU Affero General Public License v3.0

Resume from Pretrained Checkpoint (Video) #77

Closed: wenyi-li closed this issue 2 months ago

wenyi-li commented 3 months ago

Hi,

I would like to continue training (fine-tune) from the video-branch checkpoint you released: SDv1.5mv-rawbox-t_2023-12-04_17-51_2.0t_0.4.3E4-S77040. However, I ran into the following issues:

  1. When I try to resume training using the command line with resume_from_checkpoint={path_to_ckpt}, I get an error indicating that the controlnet cannot be found.

  2. I modified the UNet loading logic in the code like this:

    unet = UNet2DConditionModelMultiview.from_pretrained("./MagicDrive-pretrained/SDv1.5mv-rawbox-t_2023-12-04_17-51_2.0t_0.4.3E4-S77040/weight-E4-S77040/unet")

    However, after fine-tuning for only 50 steps (using a warmup_to_constant lr strategy), the FVD of generated videos increased from 221.8254 (from the original checkpoint) to 394.1425.
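Before calling any loader, it can help to check which sub-model folders a weight directory actually contains. The sketch below is only an illustration. The `unet` folder name comes from the path in item 2 and `controlnet` from the error in item 1; whether the released checkpoint ships both folders is an assumption, and `missing_components` is a hypothetical helper, not part of MagicDrive.

```python
from pathlib import Path
from typing import Iterable, List


def missing_components(
    weight_dir: str,
    expected: Iterable[str] = ("unet", "controlnet"),  # assumed folder names
) -> List[str]:
    """Return the expected sub-model folders that are absent from weight_dir."""
    root = Path(weight_dir)
    # A component counts as present only if its folder exists on disk.
    return [name for name in expected if not (root / name).is_dir()]
```

If the returned list is non-empty, any component that is not loaded from the checkpoint would presumably start from other weights, so it is worth confirming where each component's weights come from before fine-tuning.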

Could you please help me understand why this is happening and how I can continue training based on the ckpt provided for video generation?

Thank you!

flymin commented 3 months ago
  1. Please see the difference described in the FAQ. You can check how test.py resumes a pre-trained model.
  2. Run a test before launching your training flow. Please check and use validation_before_run in the config.
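The idea behind item 2 can be sketched generically: evaluate the loaded weights once before any optimizer step, so that a checkpoint that was not restored correctly shows up as a bad baseline rather than being blamed on fine-tuning. This is a minimal sketch, not the repository's runner. `train_with_baseline_check`, `train_step`, and `validate` are hypothetical names, and only the flag name `validation_before_run` is taken from the comment above.

```python
from typing import Callable, List, Tuple


def train_with_baseline_check(
    train_step: Callable[[int], None],   # hypothetical: runs one optimizer step
    validate: Callable[[], float],       # hypothetical: returns a metric such as FVD
    num_steps: int,
    validation_before_run: bool = True,
) -> List[Tuple[int, float]]:
    """Return (step, metric) pairs; step 0 is the checkpoint before training."""
    history: List[Tuple[int, float]] = []
    if validation_before_run:
        # Baseline: metric of the loaded weights before any update.
        history.append((0, validate()))
    for step in range(1, num_steps + 1):
        train_step(step)
    # Metric after the last training step.
    history.append((num_steps, validate()))
    return history
```

If the step-0 metric already differs from the number reported for the released checkpoint, the problem most likely lies in how the weights were loaded rather than in the training steps themselves.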