dvlab-research / LLaMA-VID

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
Apache License 2.0

How to resume the checkpoint to continue pretraining? #84

Open Einstone-rose opened 5 months ago

Einstone-rose commented 5 months ago

I ran into an issue when attempting to resume pretraining:

```
trainer.train(resume_from_checkpoint=True)
  File "/opt/conda/lib/python3.8/site-packages/transformers/trainer.py", line 1539, in train
    return inner_training_loop(
  File "/opt/conda/lib/python3.8/site-packages/transformers/trainer.py", line 1676, in _inner_training_loop
    deepspeed_load_checkpoint(self.model_wrapped, resume_from_checkpoint)
  File "/opt/conda/lib/python3.8/site-packages/transformers/deepspeed.py", line 389, in deepspeed_load_checkpoint
    raise ValueError(f"Can't find a valid checkpoint at {checkpoint_path}")
ValueError: Can't find a valid checkpoint at /ossfs/workspace/mnt_new/xxx/llama-vid/work_dirs/llama-vid-7b-pretrain-224-video-fps-1/checkpoint-15000
```