ArchchanaKugathasan opened 6 days ago
I won't have time to maintain strong support for all kinds of non-Llama-based models. But I'll tell you what I notice from your report:
Thank you very much for your prompt reply.
I have tried this model with 4-bit quantization and LoRA (without a regression head, just prompting) using another script, and it worked. Could the problem be in the transformer_heads library?
Also, is there any possibility of using your code without QLoRA? If so, what needs to be changed in the script?
Well, there is https://github.com/center-for-humans-and-machines/transformer-heads/blob/main/notebooks/gpt2/text_classification_full_finetune.ipynb. Haven't tested that in a while though, given that I am rarely in situations where I have enough GPU VRAM to do full finetuning.
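Roughly, that notebook's path boils down to loading the headed model without any quantization or LoRA config. A minimal sketch from memory, so the loader name (load_headed) and the exact HeadConfig fields are assumptions to be checked against the notebook:

# Sketch: full finetuning with a head, no QLoRA (based on the gpt2 notebook;
# load_headed and the HeadConfig fields shown here are assumptions, verify
# against the current transformer_heads API).
from transformer_heads import HeadConfig, load_headed
from transformers import GPT2LMHeadModel

head_config = HeadConfig(
    name="regression_head",
    layer_hook=-1,              # attach the head to the final hidden layer
    in_size=768,                # hidden size of gpt2-small
    output_activation="linear",
    is_causal_lm=False,
    loss_fct="mse",
    num_outputs=1,
    is_regression=True,
)
model = load_headed(
    GPT2LMHeadModel,            # base model class
    "gpt2",                     # model path / hub id
    head_configs=[head_config],
)
# From here, training is a regular full finetune: no bitsandbytes config and
# no prepare_model_for_kbit_training step, so the QLoRA code path is skipped.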
Thank you :), I will check this.
I have checked whether this model (CohereForAI/aya-expanse-8b) supports QLoRA, and the following tutorial confirms that it does: https://youtu.be/ChIxwXCI9aY?si=sq09vRJFkrsx-8El. I have also run a few tests which confirm that QLoRA works with this model.
So I am wondering what could possibly be causing this error:
Traceback (most recent call last):
  File "/vol/research/Archchana/Experiments/regression_head_Mat/exp-4/aya_train_multilingual-GEMBA.py", line 270, in <module>
Hi,
I have tried running the CohereForAI/aya-expanse-8b model. I added the following code to your script:
---------------------------------CODE CHANGE 1-----------------------------------------------------
from transformer_heads.constants import model_type_map, loss_fct_map
import torch.nn as nn
from transformers import AutoModelForCausalLM

loss_fct_map["nll"] = nn.NLLLoss()
model_type_map["auto"] = ("model", AutoModelForCausalLM)
For the above CODE CHANGE 1, I got the following error:
------------------------------CODE CHANGE 1 ERROR --------------------
Traceback (most recent call last):
  File "/vol/research/Archchana/Experiments/regression_head_Mat/exp-4/train_multilingual-GEMBA.py", line 185, in <module>
    model = create_headed_qlora(
  File "/vol/research/Archchana/Anaconda3/envs/transhead2/lib/python3.10/site-packages/transformer_heads/util/load_model.py", line 256, in create_headed_qlora
    model: HeadedModel = model.from_pretrained(
  File "/vol/research/Archchana/Anaconda3/envs/transhead2/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3832, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/vol/research/Archchana/Anaconda3/envs/transhead2/lib/python3.10/site-packages/transformer_heads/model/model.py", line 699, in __init__
    model_type_map[config.model_type][0],
KeyError: 'cohere'
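The KeyError itself is informative: model.py resolves the base transformer class via model_type_map[config.model_type], and config.model_type comes from the checkpoint's config.json, so an entry registered under "auto" is never consulted. A minimal read-only probe of the key that will actually be used (the model id is the one from this report):

from transformers import AutoConfig

# The headed model indexes model_type_map by config.model_type, so any
# custom entry has to be registered under that exact string.
config = AutoConfig.from_pretrained("CohereForAI/aya-expanse-8b")
print(config.model_type)  # -> "cohere", the key that raised the KeyError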
So I changed the code to the following:
---------------------------------CODE CHANGE 2-----------------------------------------------------
def cohere_model_loader(config):
    return AutoModelForCausalLM.from_pretrained(config._name_or_path, trust_remote_code=True)

model_type_map["cohere"] = ("model", cohere_model_loader)
When I make this change, it shows the following error message:
---------------------------------CODE CHANGE 2 ERROR -----------------------------------------------------
Traceback (most recent call last):
  File "/vol/research/Archchana/Experiments/regression_head_Mat/exp-4/aya_train_multilingual-GEMBA.py", line 193, in <module>
    model = create_headed_qlora(
  File "/vol/research/Archchana/Anaconda3/envs/transhead2/lib/python3.10/site-packages/transformer_heads/util/load_model.py", line 268, in create_headed_qlora
    model = prepare_model_for_kbit_training(
  File "/vol/research/Archchana/Anaconda3/envs/transhead2/lib/python3.10/site-packages/peft/utils/other.py", line 116, in prepare_model_for_kbit_training
    model.enable_input_require_grads()
  File "/vol/research/Archchana/Anaconda3/envs/transhead2/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1761, in enable_input_require_grads
    self._require_grads_hook = self.get_input_embeddings().register_forward_hook(make_inputs_require_grads)
  File "/vol/research/Archchana/Anaconda3/envs/transhead2/lib/python3.10/site-packages/transformers/models/cohere/modeling_cohere.py", line 994, in get_input_embeddings
    return self.model.embed_tokens
  File "/vol/research/Archchana/Anaconda3/envs/transhead2/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1709, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'CohereForCausalLM' object has no attribute 'embed_tokens'
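For context on the AttributeError: in transformers, CohereForCausalLM stores the base transformer as self.model (a CohereModel), and embed_tokens lives on that inner CohereModel, which is why get_input_embeddings returns self.model.embed_tokens. Because cohere_model_loader returns a CohereForCausalLM, the headed model ends up holding a second causal-LM wrapper where a bare CohereModel is expected, so self.model.embed_tokens resolves against the wrapper, which has no embed_tokens of its own. A registration that mirrors the shape of the library's built-in entries (attribute name plus bare base-model class) is sketched below; the "model" attribute name is an assumption carried over from the Llama entry, and this is untested for Cohere:

from transformer_heads.constants import model_type_map
from transformers.models.cohere.modeling_cohere import CohereModel

# Map the checkpoint's model_type to the *bare* CohereModel rather than to
# CohereForCausalLM or a loader returning it. The first tuple element names
# the attribute under which the base transformer is stored inside the
# causal-LM wrapper ("model" for Llama-style architectures).
model_type_map["cohere"] = ("model", CohereModel)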
Could you please help me with this issue?
Thank you!