Closed · hill2hill closed this 4 months ago
When training on multiple GPUs without LoRA, retrieving the state_dict can stall: the tensors are sharded across GPUs, so they cannot be gathered directly and the process hangs.
The fix just follows the approach already used in train.py (see the sketch below).
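For context, this kind of hang usually happens when only some ranks participate in the collective that gathers the sharded parameters, so the safe pattern is to build the state_dict on every rank and only write it from the rank that should save. Below is a minimal sketch of that pattern, assuming a Hugging Face Trainer; the helper name `safe_save_model_for_hf_trainer` and the `deepspeed` attribute check are illustrative assumptions, not necessarily identical to what this repo's train.py does.

```python
import torch


def safe_save_model_for_hf_trainer(trainer, output_dir: str):
    """Gather the (possibly sharded) weights on every rank, save on rank 0 only."""
    if getattr(trainer, "deepspeed", None):
        # With DeepSpeed ZeRO, let the engine handle gathering the sharded
        # parameters; calling save_model on all ranks avoids the deadlock.
        torch.cuda.synchronize()
        trainer.save_model(output_dir)
        return

    # state_dict() must be called on every rank so the collective completes.
    state_dict = trainer.model.state_dict()
    if trainer.args.should_save:
        # Move tensors to CPU before writing to keep GPU memory free.
        cpu_state_dict = {key: value.cpu() for key, value in state_dict.items()}
        del state_dict
        trainer._save(output_dir, state_dict=cpu_state_dict)
```

The important detail is that the gathering step runs on all ranks while only the saving rank touches the filesystem; skipping the gather on non-zero ranks is what causes the stall described above.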
Thanks for the fix.