ymcui / Chinese-LLaMA-Alpaca-3

Chinese Llama and Alpaca LLM project, phase 3 (Chinese Llama-3 LLMs), developed from Meta Llama 3
Apache License 2.0

Merged model fails at inference #69

Closed MonetCH closed 3 months ago

MonetCH commented 3 months ago

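Items that must be checked before submitting

Issue type

Model training and fine-tuning

Base model

Llama-3-Chinese-8B (base model)

Operating system

Linux

Detailed description of the problem

# Paste the code you ran here (inside this code block)
Running inference.py on a model that I pre-trained myself and then merged fails with the following error: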
Traceback (most recent call last):
  File "/mlsteam/data/Q21/nick/Chinese-LLaMA-Alpaca-3/scripts/inference/inference_hf.py", line 105, in <module>
    model = AutoModelForCausalLM.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 563, in from_pretrained
    return model_class.from_pretrained(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 3677, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 4084, in _load_pretrained_model
    state_dict = load_state_dict(shard_file, is_quantized=is_quantized)
  File "/usr/local/lib/python3.10/dist-packages/transformers/modeling_utils.py", line 507, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

I ran pre-training with the run_pt.sh script from chinese-llama2, then merged with merge_llama3_with_chinese_lora_low_mem.py.
The base_model is meta-llama/Meta-Llama-3-8B.

In addition, during the merge I deleted enable_lora and merge_weights from adapter_config.json.
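
The manual edit to adapter_config.json described above can be scripted. The following is a minimal sketch, assuming the two keys are simply not recognized by the installed peft version (the thread does not confirm this). `strip_adapter_keys` and `UNSUPPORTED_KEYS` are hypothetical names introduced for illustration and are not part of this project's scripts.

```python
import json
from pathlib import Path

# Keys the issue author deleted by hand. Treating them as unsupported by the
# installed peft version is an assumption, not something the thread confirms.
UNSUPPORTED_KEYS = ("enable_lora", "merge_weights")


def strip_adapter_keys(config_path, keys=UNSUPPORTED_KEYS):
    """Delete the given keys from an adapter_config.json file.

    Returns the list of keys that were actually present and removed.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))

    removed = [key for key in keys if key in config]
    for key in removed:
        del config[key]

    # Rewrite the file in place; all other settings are left untouched.
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False) + "\n",
        encoding="utf-8",
    )
    return removed
```

Calling `strip_adapter_keys("adapter_config.json")` a second time returns an empty list, so the helper is safe to rerun. Editing the config only changes how the adapter is loaded; it does not by itself show that the merged weights are valid.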

Dependencies (required for code-related issues)

# Paste your dependency versions here (inside this code block)
bitsandbytes              0.43.1
peft                      0.7.1
pytorch-quantization      2.1.2
torch                     2.3.0a0+ebedce2
torch-tensorrt            2.3.0a0
torchdata                 0.7.1a0
torchtext                 0.17.0a0
torchvision               0.18.0a0
transformers              4.40.0

ymcui commented 3 months ago

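There is probably a problem in the training or merging process that left the model incomplete. That said, since you are fine-tuning Llama-3, why not train with the code in this project (third generation) instead of the second-generation code? We have tested the third-generation code and it works: merging and inference run normally without the adapter_config.json parameter changes you mentioned.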
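
One way to check whether a merged shard is incomplete is to read its header directly. A safetensors file begins with an 8-byte little-endian unsigned integer giving the length of a JSON header, followed by that header and then the tensor data. The sketch below is a hedged illustration: `inspect_safetensors_header` is a hypothetical helper, and the 100 MB cap is an assumption about the limit behind `HeaderTooLarge`. The thread itself never identifies the root cause.

```python
import json
import struct
from pathlib import Path

# Assumption: approximates the header size cap enforced by the safetensors reader.
MAX_HEADER_BYTES = 100_000_000


def inspect_safetensors_header(file_path):
    """Return (ok, message) saying whether the file starts with a plausible safetensors header."""
    path = Path(file_path)
    size = path.stat().st_size
    with path.open("rb") as f:
        prefix = f.read(8)
        if len(prefix) < 8:
            return False, f"file is only {size} bytes, too short for a header"
        # First 8 bytes: little-endian unsigned 64-bit length of the JSON header.
        (header_len,) = struct.unpack("<Q", prefix)
        if header_len > MAX_HEADER_BYTES or header_len > size - 8:
            return False, (
                f"declared header length {header_len} is implausible "
                f"for a {size}-byte file"
            )
        try:
            header = json.loads(f.read(header_len))
        except ValueError:
            # Covers both invalid JSON and undecodable bytes.
            return False, "header bytes are not valid JSON"
    if not isinstance(header, dict):
        return False, "header is not a JSON object"
    tensor_count = sum(1 for name in header if name != "__metadata__")
    return True, f"header OK, {tensor_count} tensors listed"
```

Running this over every `*.safetensors` file in the merged output directory shows which shard, if any, is malformed. A file that is really a text placeholder or a checkpoint saved in another format typically decodes to an enormous header length, which is consistent with the error in the traceback.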

MonetCH commented 3 months ago

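Hello. The third-generation code had not been released when I started, so I tried the second-generation code first. Could the differences between the second- and third-generation code be the cause?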

ymcui commented 3 months ago

We do not guarantee that the second-generation code is compatible with Llama-3, so I recommend training Llama-3 with the code in this project.

MonetCH commented 3 months ago

Understood, thank you.