mymusise / ChatGLM-Tuning

A fine-tuning solution based on ChatGLM-6B + LoRA

ValueError: weight is on the meta device, we need a `value` to put in on 0. #120

Open hurun opened 1 year ago

hurun commented 1 year ago

System: CentOS 7. CUDA: system-level cuda-11.2, conda env cudatoolkit=11.6.0. Python: 3.8.

I set up the environment following the requirements file; running the command below fails with an error:

python finetune.py \
    --dataset_path data/alpaca \
    --lora_rank 8 \
    --per_device_train_batch_size 6 \
    --gradient_accumulation_steps 1 \
    --max_steps 52000 \
    --save_steps 1000 \
    --save_total_limit 2 \
    --learning_rate 1e-4 \
    --fp16 \
    --remove_unused_columns false \
    --logging_steps 50 \
    --output_dir output

The error output:

===================================BUG REPORT===================================
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: /appletree/miniconda3/envs/chatglm/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 7.0
CUDA SETUP: Detected CUDA version 116
/appletree/miniconda3/envs/chatglm/lib/python3.8/site-packages/bitsandbytes/cuda_setup/main.py:136: UserWarning: WARNING: Compute capability < 7.5 detected! Only slow 8-bit matmul is supported for your GPU!
  warn(msg)
CUDA SETUP: Loading binary /appletree/miniconda3/envs/chatglm/lib/python3.8/site-packages/bitsandbytes/libbitsandbytes_cuda116_nocublaslt.so...
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a `revision` is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Overriding torch_dtype=None with `torch_dtype=torch.float16` due to requirements of `bitsandbytes` to enable model loading in mixed int8. Either pass torch_dtype=torch.float16 or don't pass this argument at all to remove this warning.
Loading checkpoint shards: 100%|██████████| 8/8 [00:13<00:00,  1.75s/it]
Traceback (most recent call last):
  File "finetune.py", line 121, in <module>
    main()
  File "finetune.py", line 80, in main
    model = AutoModel.from_pretrained(
  File "/appletree/miniconda3/envs/chatglm/lib/python3.8/site-packages/transformers/models/auto/auto_factory.py", line 466, in from_pretrained
    return model_class.from_pretrained(
  File "/appletree/miniconda3/envs/chatglm/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2697, in from_pretrained
    dispatch_model(model, device_map=device_map, offload_dir=offload_folder, offload_index=offload_index)
  File "/appletree/miniconda3/envs/chatglm/lib/python3.8/site-packages/accelerate/big_modeling.py", line 370, in dispatch_model
    attach_align_device_hook_on_blocks(
  File "/appletree/miniconda3/envs/chatglm/lib/python3.8/site-packages/accelerate/hooks.py", line 471, in attach_align_device_hook_on_blocks
    add_hook_to_module(module, hook)
  File "/appletree/miniconda3/envs/chatglm/lib/python3.8/site-packages/accelerate/hooks.py", line 155, in add_hook_to_module
    module = hook.init_hook(module)
  File "/appletree/miniconda3/envs/chatglm/lib/python3.8/site-packages/accelerate/hooks.py", line 244, in init_hook
    set_module_tensor_to_device(module, name, self.execution_device)
  File "/appletree/miniconda3/envs/chatglm/lib/python3.8/site-packages/accelerate/utils/modeling.py", line 136, in set_module_tensor_to_device
    raise ValueError(f"{tensor_name} is on the meta device, we need a `value` to put in on {device}.")
ValueError: weight is on the meta device, we need a `value` to put in on 0.
jackaduma commented 1 year ago

Same question here...

tomcat123a commented 1 year ago

If you are loading on multiple GPUs, use the multi-GPU loading code from the ChatGLM-6B GitHub repo; see the sketch below.
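
A minimal sketch of that approach, assuming the load_model_on_gpus helper from utils.py in the THUDM/ChatGLM-6B repo (copy that file next to your script; the model path and GPU count here are placeholders):

# load_model_on_gpus builds an explicit device_map that spreads the
# transformer layers across num_gpus GPUs, so every weight is assigned
# a real device instead of being left on the meta device.
from utils import load_model_on_gpus

model = load_model_on_gpus("THUDM/chatglm-6b", num_gpus=2)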

Mike-ihr commented 2 months ago

You need to set device_map="auto" when you call the from_pretrained function. For more details, see https://huggingface.co/docs/accelerate/usage_guides/big_modeling
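
A sketch of that fix (the load_in_8bit and trust_remote_code arguments are assumptions mirroring what the traceback suggests finetune.py does; adjust to your setup):

from transformers import AutoModel

# device_map="auto" lets accelerate place every weight on a concrete
# device (GPU or CPU), so no parameter is left uninitialized on the
# meta device, which is what triggers the ValueError above.
model = AutoModel.from_pretrained(
    "THUDM/chatglm-6b",
    load_in_8bit=True,        # int8 loading via bitsandbytes
    trust_remote_code=True,   # ChatGLM ships custom modeling code
    device_map="auto",
)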