unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

beam search does not work for gemma2b #923

Open world2vec opened 2 months ago

world2vec commented 2 months ago

Env: torch 2.4, CUDA 12.4, unsloth main. The code below raises an error:

from unsloth import FastLanguageModel
import torch

model_id="unsloth/gemma-2-2b-it-bnb-4bit"
model, tokenizer = FastLanguageModel.from_pretrained(model_id, dtype=torch.float16, use_cache=False,
                                                         max_seq_length=1024, load_in_4bit=True)
FastLanguageModel.for_inference(model)
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids, num_beams=2, max_new_tokens=10)

Error:

NotImplementedError: Make sure that a `_reorder_cache` function is correctly implemented in transformers.models.gemma2.modeling_gemma2 to enable beam search for <class 'transformers.models.gemma2.modeling_gemma2.Gemma2ForCausalLM'>
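
For context on the hook named in the traceback: during beam search, each step picks which beams survive, and the cached keys and values have to be re-indexed so that row i of the cache belongs to the hypothesis now occupying beam slot i. The following is a minimal sketch of that idea only. It uses plain Python lists instead of tensors, and the names `reorder_cache`, `Row`, and `LayerCache` are hypothetical; it is not the transformers or unsloth implementation.

```python
from typing import List, Sequence, Tuple

# One cache row per beam. Here a "row" is just a list of numbers;
# in a real model it would be a slice of a key or value tensor.
Row = List[float]
LayerCache = Tuple[List[Row], List[Row]]  # (keys, values), indexed by beam


def reorder_cache(
    past: Sequence[LayerCache], beam_idx: Sequence[int]
) -> Tuple[LayerCache, ...]:
    """For every layer, pick the cache rows of the beams that survived this step."""
    reordered = []
    for keys, values in past:
        # Copy each selected row so duplicated beams do not share storage.
        new_keys = [list(keys[i]) for i in beam_idx]
        new_values = [list(values[i]) for i in beam_idx]
        reordered.append((new_keys, new_values))
    return tuple(reordered)


# Two beams, one layer. Suppose both surviving hypotheses extend beam 1.
past = [([[0.1, 0.2], [0.3, 0.4]], [[1.0, 2.0], [3.0, 4.0]])]
new_past = reorder_cache(past, beam_idx=[1, 1])
print(new_past[0][0])  # the key rows now hold beam 1's row twice
```

With tensors, the same selection is typically an index-select along the batch/beam dimension. The NotImplementedError above indicates that generation could not find such a reordering step for the model class in use.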

With the Hugging Face code there is no error:

from unsloth import FastLanguageModel
import torch

model_id="unsloth/gemma-2-2b-it-bnb-4bit"
model, tokenizer = FastLanguageModel.from_pretrained(model_id, dtype=torch.float16, use_cache=False,
                                                         max_seq_length=1024, load_in_4bit=True)
FastLanguageModel.for_inference(model)
input_text = "Write me a poem about Machine Learning."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")
outputs = model.generate(**input_ids, num_beams=2, max_new_tokens=10)

danielhanchen commented 2 months ago

Will check this out!

practicingman commented 2 months ago

This happens with unsloth/Meta-Llama-3.1-8B too. When I add use_cache=False to model.generate, it raises:

RuntimeError: The size of tensor a (32) must match the size of tensor b (1300) at non-singleton dimension 1

danielhanchen commented 2 months ago

Hmm ok will reinvestigate!

anderleich commented 1 week ago

Any news on this? I get the same error.

shimmyshimmer commented 5 days ago

Still investigating this!