unslothai / unsloth

Finetune Llama 3.2, Mistral, Phi, Qwen & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0

Add support for passing `inputs_embeds` to the `generate` function #862

Open jonflynng opened 3 months ago

jonflynng commented 3 months ago

I need to call `generate` with `inputs_embeds` for a multi-modal model I'm building; I can't use `input_ids`. As far as I can see, Unsloth doesn't currently support this. Would it be possible to fall back to the normal transformers inference path, so that `generate` runs without Unsloth?

danielhanchen commented 3 months ago

There is a way to overwrite the code itself and allow `inputs_embeds` to be passed, but it'll be a bit of custom code. Another way is to save the Unsloth model you just finetuned as a normal HF model, and then use HF directly (without Unsloth).

dabs9 commented 3 weeks ago

> There is a way to overwrite the code itself and allow `inputs_embeds` to be passed, but it'll be a bit of custom code - another way is to save the Unsloth model you just finetuned as a normal HF model, and then use HF directly (without Unsloth)

The workaround works @danielhanchen but still would love that fast Unsloth inference :)

danielhanchen commented 3 weeks ago

I'll see what I can do, but it'll be a bit tough to edit HF directly :(