unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

NotImplementedError: Make sure that a `_reorder_cache` function is correctly implemented in transformers.models.llama.modeling_llama to enable beam search for <class 'transformers.models.llama.modeling_llama.LlamaForCausalLM'> #1099

Open kiranpedvak opened 1 month ago

kiranpedvak commented 1 month ago

When I add `num_beams=3` during inference, I get this error.
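For context, the missing `_reorder_cache` function is the hook beam search uses to keep the KV cache aligned with the surviving beams: after each step, the cached key/value states of every layer must be re-indexed along the batch dimension to follow the beams that were kept. The sketch below is illustrative only (it is not unsloth's or transformers' actual implementation, and it models the cache tensors as plain Python lists):

```python
# Illustrative sketch of what a `_reorder_cache` hook does during beam search.
# Assumption: cache "tensors" are modeled as plain lists indexed by batch
# position; real implementations index torch tensors with `index_select`.

def reorder_cache(past_key_values, beam_idx):
    # past_key_values: one (key, value) pair per layer.
    # beam_idx: for each beam slot, which previous beam it continues from.
    return tuple(
        tuple([state[i] for i in beam_idx] for state in layer)
        for layer in past_key_values
    )

# Example: 3 beams, one layer; this step keeps beams 2, 2, and 0.
layer = (["k0", "k1", "k2"], ["v0", "v1", "v2"])
reordered = reorder_cache((layer,), beam_idx=[2, 2, 0])
# reordered[0] is (["k2", "k2", "k0"], ["v2", "v2", "v0"])
```

The `NotImplementedError` is raised because the patched model class does not provide this hook, so `generate(..., num_beams=3)` has no way to reorder the cache.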

danielhanchen commented 1 month ago

Will investigate!

yelboudouri commented 3 weeks ago

Any updates on this issue?