OpenGVLab / InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model with performance approaching GPT-4o
https://internvl.readthedocs.io/en/latest/
MIT License
6.15k stars 478 forks

internvl2 video data train CUDA out of memory #547

Open gaotiexinqu opened 3 months ago

gaotiexinqu commented 3 months ago

I am running a full fine-tune of InternVL2 on video data, using a single 32 GB V100 GPU, and I get the error torch.cuda.OutOfMemoryError: CUDA out of memory.

torchrun /cache/InternVL/internvl_chat/internvl/train/internvl_chat_finetune.py \
  --model_name_or_path /cache/MODELS/internvl2-4B \
  --conv_style "phi3-chat" \
  --output_dir /cache/InternVL/OUTPUTS/internvl_chat_v1_5_phi3_3_8b_dynamic_res_finetune_debug_load_2nd \
  --meta_path /cache/InternVL/internvl_chat/shell/data/internvl_1_2_finetune_7k.json \
  --overwrite_output_dir True \
  --force_image_size 448 \
  --max_dynamic_patch 1 \
  --down_sample_ratio 0.5 \
  --drop_path_rate 0.1 \
  --freeze_llm False \
  --freeze_mlp False \
  --freeze_backbone True \
  --vision_select_layer -1 \
  --dataloader_num_workers 4 \
  --bf16 False \
  --fp16 True \
  --num_train_epochs 1 \
  --per_device_train_batch_size 1 \
  --gradient_accumulation_steps 4 \
  --evaluation_strategy "no" \
  --save_strategy "steps" \
  --save_steps 200 \
  --save_total_limit 1 \
  --learning_rate 4e-5 \
  --weight_decay 0.05 \
  --warmup_ratio 0.03 \
  --lr_scheduler_type "cosine" \
  --logging_steps 1 \
  --max_seq_length 4096 \
  --do_train True \
  --grad_checkpoint True \
  --group_by_length True \
  --dynamic_image_size True \
  --use_thumbnail True \
  --ps_version 'v2' \
  --deepspeed /cache/ZYM/InternVL/internvl_chat/zero_stage2_config.json \
  --report_to "tensorboard" \
  2>&1 | tee -a /cache/ZYM/InternVL/OUTPUTS/internvl_chat_v1_5_phi3_3_8b_dynamic_res_finetune_debug/training_log.txt

In class LazySupervisedDataset(Dataset), both frame-count parameters are set to 1:

    min_num_frame=1,  # for video data
    max_num_frame=1,  # for video data

Videos are loaded with the decord library: frames = read_frames_decord(fn, num_frames=max_num_frames, min_num_frames=min_num_frames, sample=sample, clip=clip)

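batch_size is set to 1. Although video data does not seem to use dynamic high resolution, I still set max_dynamic_patch to 1. In video_dataloader, the number of loaded frames is fixed at 1.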

Even with the settings above, GPU memory still overflows...
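
To put these settings in perspective, the number of visual tokens they produce can be estimated as below. This is a rough sketch, not code taken from the repository: it assumes a ViT patch size of 14 (not stated in this thread) and that each video frame is treated as a single 448x448 tile. The function name is hypothetical.

```python
def visual_tokens_per_tile(image_size: int, patch_size: int, down_sample_ratio: float) -> int:
    """Estimate the tokens the LLM receives for one tile after spatial downsampling."""
    patches_per_side = image_size // patch_size
    # Downsampling by a ratio r on each side scales the token count by r squared.
    return int(patches_per_side ** 2 * down_sample_ratio ** 2)


# Values from the command above; patch_size=14 is an assumption.
tokens_per_frame = visual_tokens_per_tile(image_size=448, patch_size=14, down_sample_ratio=0.5)
num_frames = 1  # min_num_frame = max_num_frame = 1
total = tokens_per_frame * num_frames
print(total)
```

Under these assumptions a single frame contributes about 256 visual tokens, far below --max_seq_length 4096, which suggests the input size is not the main driver of the memory use.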

ErfeiCui commented 2 months ago

If GPU memory runs out, you can set --freeze_llm to True; it should then no longer overflow.
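
A back-of-the-envelope estimate suggests why freezing the LLM helps. The sketch below uses the common rule of thumb for mixed-precision Adam training: about 16 bytes per trainable parameter (fp16 weight and gradient, plus fp32 master weight and two optimizer moments) and 2 bytes per frozen parameter. The parameter counts are assumptions: roughly 3.8 billion for the LLM, inferred from "phi3_3_8b" in the output path, and roughly 0.3 billion for the vision backbone. Activations and the MLP projector are ignored, no CPU offload is assumed, and on a single GPU ZeRO stage 2 has no other ranks to shard optimizer state across.

```python
BYTES_TRAINABLE = 2 + 2 + 12  # fp16 weight + fp16 grad + fp32 master weight and Adam m, v
BYTES_FROZEN = 2              # fp16 weight only


def model_state_gb(trainable_params: float, frozen_params: float) -> float:
    """Rough size of weights, gradients and optimizer state in GB (1e9 bytes)."""
    total_bytes = trainable_params * BYTES_TRAINABLE + frozen_params * BYTES_FROZEN
    return total_bytes / 1e9


LLM_PARAMS = 3.8e9     # assumption, inferred from "phi3_3_8b"
VISION_PARAMS = 0.3e9  # assumption, not stated in the thread

# --freeze_llm False, --freeze_backbone True (the command in the question)
full_finetune = model_state_gb(trainable_params=LLM_PARAMS, frozen_params=VISION_PARAMS)
# --freeze_llm True, --freeze_backbone True (the suggested change)
llm_frozen = model_state_gb(trainable_params=0, frozen_params=LLM_PARAMS + VISION_PARAMS)

print(f"--freeze_llm False: ~{full_finetune:.0f} GB of model states")
print(f"--freeze_llm True:  ~{llm_frozen:.0f} GB of model states")
```

Under these assumptions, full fine-tuning needs roughly 60 GB for model states alone, about twice the 32 GB available on the V100, while the frozen-LLM case needs under 10 GB before activations.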