I am facing the same issue too.
Did anyone manage to fix it?
Any update on this issue? I'm also facing the same problem.
I've also seen this issue when loading trained LoRA checkpoints. The model works fine right after training, but when it's loaded again with PeftModel.from_pretrained it throws a similar error.
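For reference, the failing load path looks roughly like this (a minimal sketch; the adapter path is a placeholder, not my actual checkpoint):

```python
# Sketch of loading a trained LoRA adapter on top of an 8-bit base model
# ("path/to/lora-checkpoint" is a placeholder)
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m", load_in_8bit=True, torch_dtype=torch.float16, device_map={"": 0}
)
model = PeftModel.from_pretrained(base, "path/to/lora-checkpoint")  # raises the dtype error here
```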
The notebook is working fine locally too. I'm unable to reproduce the issue. @eware-godaddy, could you please share a minimal reproducible example for us to deep dive into the issue?
Could everyone facing the issue try installing the main branch and see if that resolves the issue?
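For anyone unsure how to do that, installing directly from the main branch looks like this:

```
pip install git+https://github.com/huggingface/peft
```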
I have tried different versions of peft and transformers with no success. I was able to run the code by removing the following two lines: `fp16=True` and `model = prepare_model_for_int8_training(model)`. Alternatively, I can load the model without `load_in_8bit=True`, and then the `fp16=True` flag works.
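For the second workaround, the load then looks roughly like this (a sketch using the same OPT checkpoint as the full script below):

```python
# Sketch of the alternative: skip 8-bit quantization so fp16 training works
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    torch_dtype=torch.float16,  # no load_in_8bit=True
    device_map={"": 0},
)
# ...with this, fp16=True in TrainingArguments no longer raises the dtype error
```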
The following code, where the parameters are cast manually, also works, as long as I don't cast the parameters with ndim == 1 to float and don't use fp16 training.
```python
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

import torch
import torch.nn as nn
import bitsandbytes as bnb
from transformers import AutoTokenizer, AutoConfig, AutoModelForCausalLM

model_path = 'facebook/opt-125m'
model = AutoModelForCausalLM.from_pretrained(
    model_path, load_in_8bit=True, torch_dtype=torch.float16, device_map={"": 0}
)
tokenizer = AutoTokenizer.from_pretrained(model_path, device_map={"": 0})

# Freeze the 8-bit base model
for param in model.parameters():
    param.requires_grad = False
    # if param.ndim == 1:
    #     param.data = param.data.to(torch.float32)  # leads to "RuntimeError: expected scalar type Half but found Float"

model.gradient_checkpointing_enable()
model.enable_input_require_grads()

class CastOutputToFloat(nn.Sequential):
    def forward(self, x):
        return super().forward(x).to(torch.float32)

model.lm_head = CastOutputToFloat(model.lm_head)

from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05, bias="none", task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)

import transformers
from datasets import load_dataset

data = load_dataset("Abirate/english_quotes")
data = data.map(lambda samples: tokenizer(samples["quote"]), batched=True)

trainer = transformers.Trainer(
    model=model,
    train_dataset=data["train"],
    args=transformers.TrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=1,
        warmup_steps=100,
        max_steps=200,
        learning_rate=2e-4,
        logging_steps=1,
        output_dir='outputs',
        # fp16=True,  # leads to "RuntimeError: expected scalar type Half but found Float"
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
model.config.use_cache = False
trainer.train()
```
I marked the two lines that lead to this error with a comment. This seems to be more of an issue with the transformers.Trainer class than with peft itself, because the error also occurs when I don't use peft.
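If it helps to see why the manual casts collide with fp16: as far as I can tell, `prepare_model_for_int8_training` does roughly the following (a paraphrased sketch from reading the source, not the exact library code):

```python
# Rough paraphrase of what peft's prepare_model_for_int8_training does
import torch

def prepare_for_int8_sketch(model):
    for param in model.parameters():
        param.requires_grad = False  # freeze the quantized base model
        if param.ndim == 1:
            # Cast layer-norm / bias parameters to fp32 for stability. This is
            # the cast that triggers "expected scalar type Half but found Float"
            # once fp16=True is also set in the Trainer.
            param.data = param.data.to(torch.float32)
    model.gradient_checkpointing_enable()
    model.enable_input_require_grads()
    # (the real function also wraps the output head so logits are cast to fp32)
    return model
```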
No success even with the above. I am only getting this issue on a V100-32G; everything works fine on an A100-40G. It seems like an issue with bitsandbytes.
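If it is hardware-dependent, one thing worth checking is the GPU compute capability. I believe bitsandbytes falls back to a slower int8 matmul path on GPUs older than Turing (compute capability below 7.5), which could explain the A100/V100 difference. A quick check:

```python
import torch

# V100 reports (7, 0); A100 reports (8, 0)
print(torch.cuda.get_device_capability(0))
```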
I am having a similar issue.
I am having the same issue.
I am having the same issue too. It works on an A100 but fails on a V100. Any idea how to fix it?
@pacman100 Hi, would you please take a look at this issue? Thanks in advance.
When I run this code: https://github.com/huggingface/peft/blob/main/examples/int8_training/Finetune_opt_bnb_peft.ipynb by copying and pasting it into /home/jahangmar/peft_finetune_opt_bnb.py and then executing it, I get the following output:
I have the following libraries installed:
The README links a Google Colab notebook that contains code very similar to https://github.com/huggingface/peft/blob/main/examples/int8_training/Finetune_opt_bnb_peft.ipynb, and it runs fine there, but I get the same error when I run the code on my system. I tried to mimic the Colab environment by installing the same torch version (1.13.1, in a conda environment), but this did not change the result. I also tried downgrading the transformers library and peft.
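In case it helps with reproducing, this is how I'm comparing my versions against the Colab environment (a small sketch):

```python
import torch, transformers, peft, bitsandbytes

# Print installed versions to compare against the working Colab environment
print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("peft:", peft.__version__)
print("bitsandbytes:", bitsandbytes.__version__)
```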