shuxueslpi / chatGLM-6B-QLoRA

Uses the peft library for efficient 4-bit QLoRA fine-tuning of chatGLM-6B/chatGLM2-6B, then merges the LoRA model into the base model and quantizes the result to 4 bit.
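For reference, a minimal sketch of the 4-bit QLoRA setup described above, assuming the Hugging Face model id THUDM/chatglm2-6b; the hyperparameters are illustrative, not this repo's exact training script:

import torch
from transformers import AutoModel, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantize the linear layers to 4-bit NF4 at load time.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModel.from_pretrained(
    "THUDM/chatglm2-6b", trust_remote_code=True, quantization_config=bnb_config
)
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters to ChatGLM's fused attention projection.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()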

How did you tune QLoRA? #14

Closed white-wolf-tech closed 1 year ago

white-wolf-tech commented 1 year ago

You are loading your model in 8bit or 4bit but no linear modules were found in your model. this can happen for some architectures such as gpt2 that uses Conv1D instead of Linear layers. Please double check your model architecture, or submit an issue on github if you think this is a bug.

Then running prepare_model_for_kbit_training hits OOM. From the logs it looks like the ChatGLM2-6B weights were never quantized: prepare_model_for_kbit_training casts all of the unquantized weights to float32, and that triggers the OOM.

The model weights and code I'm using are the latest as of today.
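A quick way to check whether quantization actually took effect before prepare_model_for_kbit_training runs; this is a hypothetical diagnostic, again assuming the model id THUDM/chatglm2-6b:

from transformers import AutoModel, BitsAndBytesConfig

model = AutoModel.from_pretrained(
    "THUDM/chatglm2-6b", trust_remote_code=True,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)
# Roughly 3-4 GB means the 4-bit load worked; ~12 GB or more means the
# weights stayed in half/full precision, in which case
# prepare_model_for_kbit_training will upcast them to float32 and OOM,
# as described above.
print(f"footprint: {model.get_memory_footprint() / 2**30:.1f} GiB")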

shuxueslpi commented 1 year ago

@Coder-nlper Both chatglm-6b and chatglm2-6b should be quantizable; 12 GB of VRAM is enough to run the example dataset end to end. Did you modify the code? What does your environment look like? Try printing your model; mine looks like this:

In [12]: base_model
Out[12]:
ChatGLMForConditionalGeneration(
  (transformer): ChatGLMModel(
    (embedding): Embedding(
      (word_embeddings): Embedding(65024, 4096)
    )
    (rotary_pos_emb): RotaryEmbedding()
    (encoder): GLMTransformer(
      (layers): ModuleList(
        (0-27): 28 x GLMBlock(
          (input_layernorm): RMSNorm()
          (self_attention): SelfAttention(
            (query_key_value): Linear4bit(in_features=4096, out_features=4608, bias=True)
            (core_attention): CoreAttention(
              (attention_dropout): Dropout(p=0.0, inplace=False)
            )
            (dense): Linear4bit(in_features=4096, out_features=4096, bias=False)
          )
          (post_attention_layernorm): RMSNorm()
          (mlp): MLP(
            (dense_h_to_4h): Linear4bit(in_features=4096, out_features=27392, bias=False)
            (dense_4h_to_h): Linear4bit(in_features=13696, out_features=4096, bias=False)
          )
        )
      )
      (final_layernorm): RMSNorm()
    )
    (output_layer): Linear(in_features=4096, out_features=65024, bias=False)
  )
)
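The same check can be done in code. A hypothetical snippet, assuming base_model was loaded with load_in_4bit=True as above; per the printout, chatglm2-6b should have 4 Linear4bit modules per GLMBlock x 28 blocks = 112:

import bitsandbytes as bnb

linear4bit = [name for name, module in base_model.named_modules()
              if isinstance(module, bnb.nn.Linear4bit)]
print(len(linear4bit), linear4bit[:2])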

white-wolf-tech commented 1 year ago

Following https://github.com/THUDM/ChatGLM2-6B/issues/163, I looked for the quantized layers and found none, whereas the baichuan model has them, and so does the first-generation ChatGLM-6B.

shuxueslpi commented 1 year ago

@Coder-nlper I do see them here: [screenshot]

white-wolf-tech commented 1 year ago

Found the problem: it has to be transformers==4.30.2. I was on transformers==4.31.0.dev, which doesn't work.
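A small guard against this, if you want one (the version number comes from this thread, not from any official compatibility matrix):

# pip install transformers==4.30.2
import transformers
assert transformers.__version__ == "4.30.2", transformers.__version__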

white-wolf-tech commented 1 year ago

Thanks a lot!

xiaojinchuan commented 1 year ago

@Coder-nlper A question: when I run prepare_model_for_kbit_training, I get this error:

File "/home/jinxiao/code/learn/llm-fruit/model.py", line 123, in create_model
    model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=training_args.gradient_checkpointing)
File "/home/jinxiao/miniconda3/envs/torch2.0_cu11.7/lib/python3.10/site-packages/peft/utils/other.py", line 86, in prepare_model_for_kbit_training
    model.enable_input_require_grads()
File "/home/jinxiao/miniconda3/envs/torch2.0_cu11.7/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1206, in enable_input_require_grads
    self._require_grads_hook = self.get_input_embeddings().register_forward_hook(make_inputs_require_grads)
File "/home/jinxiao/miniconda3/envs/torch2.0_cu11.7/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1223, in get_input_embeddings
    return base_model.get_input_embeddings()
File "/home/jinxiao/miniconda3/envs/torch2.0_cu11.7/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1225, in get_input_embeddings
    raise NotImplementedError

My transformers version is 4.30.2.

shuxueslpi commented 1 year ago

@xiaojinchuan You need to update your local copy of the model to the latest version, especially the .py files in it. If you pulled the model with git lfs, a plain git pull is enough.
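If the model was downloaded from the Hugging Face Hub rather than cloned with git lfs, one way to refresh the cached files (a sketch; snapshot_download re-fetches whatever changed upstream, including the .py modeling files):

from huggingface_hub import snapshot_download

local_dir = snapshot_download("THUDM/chatglm2-6b")
print(local_dir)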

xiaojinchuan commented 1 year ago

Confirmed, the latest version fixes it. Thanks a lot!

ShayDuane commented 1 year ago

@shuxueslpi Has the issue of chatglm2 not being quantizable been solved?

ShayDuane commented 1 year ago

@xiaojinchuan Did you solve the quantization problem?

shuxueslpi commented 1 year ago

@RuSignalFlag Yes, it can be quantized; just make sure you're on transformers==4.30.2.