GPU utilization is already at 100% before the model has finished loading

Reminder

[X] I have read the README and searched the existing issues.

System Info

Reproduction

As the title says, I ran the command below. The model is still loading, yet GPU memory utilization is already at 100%. Is this a bug?

export WANDB_DISABLED=true
deepspeed --master_port=9903 --num_gpus 8 src/train.py \
    --deepspeed ./examples/deepspeed/ds_z3_config.json \
    --stage sft \
    --do_train \
    --model_name_or_path --dataset --template qwen \
    --finetuning_type lora \
    --lora_target q_proj,v_proj \
    --output_dir --overwrite_cache \
    --per_device_train_batch_size 4 \
    --gradient_accumulation_steps 4 \
    --lr_scheduler_type cosine \
    --logging_steps 10 \
    --save_steps 200 \
    --learning_rate 5e-5 \
    --num_train_epochs 10 \
    --plot_loss \
    --bf16 \
    --save_only_model \
    --overwrite_output_dir

Loading checkpoint shards: 49%|███████████████▌ | 18/37 [1:05:31<2:18:39, 437.87s/it]

Expected behavior

No response

Others

No response

hiyouga / LLaMA-Factory