OpenDriveLab / Vista

A Generalizable World Model for Autonomous Driving
https://vista-demo.github.io
Apache License 2.0
362 stars, 16 forks

How to load the pretrained safetensors and continue training? #13

Open JunyuanDeng opened 1 week ago

JunyuanDeng commented 1 week ago

Hello, thanks for sharing your code!

I am now trying to train stage 2 with the provided vista.safetensors.

So I changed the command to the following:

torchrun \
    --nnodes=1 \
    --nproc_per_node=8 \
    train.py \
    --base configs/training/vista_phase2_stage2.yaml \
    --finetune ${PATH_TO_STAGE1_CKPT}/vista.safetensors \
    --num_nodes 1 \
    --n_devices 8

But many keys are reported as missing, for example: image
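To see which parameter groups are affected, the key names in the checkpoint can be compared with the key names the model expects. The following is a minimal sketch rather than anything from the Vista code base: the helper `diff_keys` and the example key names are hypothetical. In practice the two lists would presumably come from `model.state_dict().keys()` and from the keys stored in `vista.safetensors`.

```python
from collections import Counter


def diff_keys(model_keys, ckpt_keys, depth=2):
    """Count missing and unexpected keys, grouped by their first `depth` name parts.

    missing    = keys the model expects but the checkpoint does not contain
    unexpected = keys the checkpoint contains but the model does not use
    """
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)

    def prefix(key):
        # "a.b.c.weight" with depth=2 becomes "a.b"
        return ".".join(key.split(".")[:depth])

    missing = Counter(prefix(k) for k in model_keys - ckpt_keys)
    unexpected = Counter(prefix(k) for k in ckpt_keys - model_keys)
    return missing, unexpected


# Hypothetical key names, for illustration only.
# With real files, the lists could be obtained with, for example:
#   model_keys = model.state_dict().keys()
#   with safetensors.safe_open(path, framework="pt") as f: ckpt_keys = list(f.keys())
model_keys = ["model.unet.in.weight", "model.unet.in.bias", "model.lora.up.weight"]
ckpt_keys = ["model.unet.in.weight", "model.unet.in.bias", "extra.head.weight"]

missing, unexpected = diff_keys(model_keys, ckpt_keys)
print("missing:", dict(missing))
print("unexpected:", dict(unexpected))
```

If all missing keys share one prefix, that would suggest a module that is newly added in this stage. Missing keys spread across the whole model would instead suggest a naming mismatch between the checkpoint and the model.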

I also expected the loss to be low when starting from the pretrained weights, but that is not what I observe: image

I downloaded the sampled video "samples_mp4_epoch00_batch0000_step000001.mp4":

https://github.com/OpenDriveLab/Vista/assets/62542727/80f5237f-9d68-46f5-8d5b-9ec0b5587b63

What should I do to use the provided weights to start the phase 2 stage 2 training?