huggingface / optimum-habana

Easy and lightning fast training of 🤗 Transformers on Habana Gaudi processor (HPU)
Apache License 2.0

AttributeError: module 'transformers.generation.stopping_criteria' has no attribute 'MaxNewTokensCriteria' #1372

Closed hemanthkotaprolu closed 1 month ago

hemanthkotaprolu commented 1 month ago

System Info

Optimum-habana - 1.11.1
Transformers - 4.46.0.dev0
SynapseAI - v1.15.0

Reproduction

I tried to instruction-finetune Llama-3.1-70B-Instruct for a custom use case. The following is the code to reproduce the issue.

import os
import numpy as np
from tqdm import tqdm

import torch
from torch.utils.data import DataLoader, RandomSampler, SequentialSampler

from datasets import Dataset, load_dataset
import transformers
from transformers import (
    AutoModelForCausalLM, 
    AutoTokenizer, 
    Trainer, 
    pipeline,
    DataCollatorForLanguageModeling
)

from peft import (
    LoraConfig, 
    PeftModel, 
    TaskType,
    get_peft_model
)
import accelerate
from optimum.habana import GaudiConfig, GaudiTrainer, GaudiTrainingArguments

instruction ="""
Read the input text carefully and identify the emotion of the text. After identifying the emotion output the serial number of the emotion. The emotions that you need to choose from along with definitions are:
0) Anger - Feeling upset, mad, or annoyed.
1) Anticipation - Looking forward to something, eager or excited about what's to come.
2) Joy - Feeling happy, cheerful, or pleased.
3) Trust - Feeling safe, confident, or positive towards someone or something.
4) Fear - Feeling scared, anxious, or worried.
5) Surprise - Feeling shocked or startled by something unexpected.
6) Sadness - Feeling unhappy, sorrowful, or down.
7) Disgust - Feeling disgusted, repulsed, or strongly disapproving.
8) Neutral - Feeling nothing in particular, indifferent or without strong emotions.

For example, if the identified emotion is Joy, the output has to be 2, since the serial number of joy is 2.
"""

dataset = load_dataset("hemanthkotaprolu/goemotions-plutchiks")

base_model = "meta-llama/Llama-3.1-70B-Instruct"
new_model = "./models/Llama31_70b_instruct_finetuned_e1"

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

if "llama" in base_model.lower():
    model.config.use_cache = False # silence the warnings
    model.config.pretraining_tp = 1
    model.gradient_checkpointing_enable()
    model.generation_config.attn_softmax_bf16 = False
    model.generation_config.use_flash_attention = False
    model.generation_config.flash_attention_recompute = False
    model.generation_config.flash_attention_causal_mask = False
    model.generation_config.use_fused_rope = False

tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
tokenizer.padding_side = 'right'
tokenizer.pad_token = tokenizer.eos_token

model.resize_token_embeddings(len(tokenizer))
model.config.pad_token_id = tokenizer.pad_token_id

def template(item, eval):
    if eval:
        chat = [{'role': 'user', 'content': instruction + item['input']}]
    else:
        chat = [{"role": "user", "content": instruction + item["input"]},{"role": "assistant", "content": item["output"]}]

    prompt = tokenizer.apply_chat_template(chat, tokenize=False)
    return prompt

def tokenize_function(examples):
    results = tokenizer(examples['text'], padding="max_length", truncation=True, return_tensors='pt', max_length=512)
    results['labels'] = results['input_ids']  # causal LM: labels are a copy of the input ids
    return results

train_dataset = dataset['train']
train_dataset = train_dataset.map(lambda item: {'text': template(item, eval=False)})

tokenized_dataset = train_dataset.map(tokenize_function, batched=True)\
        .remove_columns(['text', 'id','output', 'input'])

peft_config = LoraConfig(
                r=16,
                lora_alpha=8,
                lora_dropout=0.05,
                target_modules=["q_proj", "k_proj", "v_proj", "o_proj","gate_proj","down_proj","up_proj"],
                bias="none",
                task_type=TaskType.CAUSAL_LM,
                )

lora_model = get_peft_model(model, peft_config)

lora_model.print_trainable_parameters()
gaudi_config = GaudiConfig()
gaudi_config.use_fused_adam = True       # use Habana's fused Adam implementation
gaudi_config.use_fused_clip_norm = True  # use Habana's fused gradient-norm clipping

data_collator = DataCollatorForLanguageModeling(tokenizer, pad_to_multiple_of=8, return_tensors="pt", mlm=False)

training_arguments = GaudiTrainingArguments(
        evaluation_strategy='epoch',
        save_strategy="epoch",
        output_dir="./logs/Llama3_8b_instruct_finetuned_e3",
        num_train_epochs=3,
        gradient_accumulation_steps=4,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        lr_scheduler_type='cosine',
        learning_rate=1e-5,
        weight_decay=0.01,
        warmup_steps=10,
        use_habana=True,
        use_lazy_mode=True,
    )

trainer = GaudiTrainer(
            model=lora_model,
            gaudi_config=gaudi_config,
            args=training_arguments,
            train_dataset=tokenized_dataset,
            data_collator=data_collator,
            )

trainer.train()

output_dir = f"./fine_tuned_model/{new_model}"
model.save_pretrained(output_dir)
tokenizer.save_pretrained(output_dir)

Expected behavior

The following is the error I am encountering.

[Screenshot of the traceback, ending in: AttributeError: module 'transformers.generation.stopping_criteria' has no attribute 'MaxNewTokensCriteria']

Please let me know if anything else is required. Thanks in advance.

regisss commented 1 month ago

Optimum Habana is not compatible with the version of Transformers you have installed; see the pinned requirement here: https://github.com/huggingface/optimum-habana/blob/v1.11.1/setup.py#L32. Please reinstall it with:

pip install optimum-habana==1.11.1
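
A quick way to confirm the mismatch locally is to compare the installed Transformers version with the Transformers requirement that the installed optimum-habana package declares. A minimal sketch, assuming only Python 3.8+ for importlib.metadata:

from importlib.metadata import requires, version

# Installed versions of both packages
print("optimum-habana:", version("optimum-habana"))
print("transformers:", version("transformers"))

# The Transformers requirement declared by the installed optimum-habana wheel
# (this mirrors the pin in the setup.py line linked above)
declared = [r for r in (requires("optimum-habana") or []) if r.startswith("transformers")]
print("declared transformers requirement:", declared)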
hemanthkotaprolu commented 1 month ago

Thanks! pip install -U optimum-habana==1.13.1 worked for me.
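
A minimal sketch to verify the fix in a given environment, assuming only that transformers is importable; on the Transformers build from the report (4.46.0.dev0) the attribute is missing, which is what raises the error in the title:

import transformers
import transformers.generation.stopping_criteria as stopping_criteria

# Print the installed Transformers version and whether it still exports
# MaxNewTokensCriteria, the symbol referenced by older optimum-habana releases.
print("transformers:", transformers.__version__)
print("MaxNewTokensCriteria available:", hasattr(stopping_criteria, "MaxNewTokensCriteria"))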