shibing624 / textgen

TextGen: implementations of text generation models, including LLaMA, ChatGLM, BLOOM, GPT2, Seq2Seq, BART, T5, SongNet, UDA and more, with out-of-the-box training and inference.
Apache License 2.0

Is the LLaMA section of the README out of date? Also, the save_pretrained() call in merge_peft_adapter.py seems broken #46

Closed PolarisRisingWar closed 1 year ago

PolarisRisingWar commented 1 year ago

It looks like the LLaMA model has been moved from llamamodel to gptmodel?
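(For context, this is roughly what I'd expect the new usage to look like after the rename; the GptModel class name and its constructor arguments are my guesses from the new module path, not verified against the current README:)

    # Guessed usage after the llamamodel -> gptmodel rename;
    # the class name and arguments below are assumptions, not verified.
    from textgen import GptModel

    model = GptModel("llama", "/data/whj/pretrained_model/llama-7b/llama")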

Two things: one is the LLaMA code example in the README, and the other is the model-conversion step, which I ran directly with Python as shown below:

python whj_download/githubs/textgen/textgen/gpt/merge_peft_adapter.py \
    --base_model_name_or_path /data/whj/pretrained_model/llama-7b/llama \
    --peft_model_path /data/whj/pretrained_model/chinese-alpaca-lora-7b \
    --output_type huggingface \
    --output_dir /data/whj/pretrained_model/chinese-alpaca-plus-7b-hf \
    --offload_dir /data/whj/cache

In whj_download/githubs/textgen/textgen/gpt/merge_peft_adapter.py, the original line 303 reads AutoModelForCausalLM.save_pretrained(base_model, output_dir). Is this a transformers version issue? I'm on transformers 4.30.2, and I had to change it to base_model.save_pretrained(output_dir) to get it to run; otherwise it raises AttributeError: type object 'AutoModelForCausalLM' has no attribute 'save_pretrained'.
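(To illustrate, here is a minimal sketch of the failing call and the fix; how base_model is actually loaded inside merge_peft_adapter.py is my assumption, and the paths are just the ones from my command above:)

    from transformers import AutoModelForCausalLM

    # Assumption: base_model is loaded earlier in merge_peft_adapter.py
    # roughly like this, and the LoRA adapter weights are merged into it.
    base_model = AutoModelForCausalLM.from_pretrained(
        "/data/whj/pretrained_model/llama-7b/llama"
    )
    output_dir = "/data/whj/pretrained_model/chinese-alpaca-plus-7b-hf"

    # Original line 303 -- fails, because save_pretrained is an instance
    # method and the AutoModelForCausalLM factory class does not define it:
    # AutoModelForCausalLM.save_pretrained(base_model, output_dir)

    # Fix that works for me on transformers 4.30.2 -- call it on the
    # loaded model instance instead:
    base_model.save_pretrained(output_dir)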

shibing624 commented 1 year ago

OK, I'll update it.