unslothai / unsloth

Finetune Llama 3.1, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
https://unsloth.ai
Apache License 2.0
15.42k stars 1.04k forks

Deploying to hugging face inference endpoint not working #860

Closed sourceful-tolu closed 1 month ago

sourceful-tolu commented 1 month ago

Hi,

I've successfully trained Llama with Unsloth on my custom dataset, tested inference in Google Colab, and saved the merged 16-bit model to Hugging Face.

However, I'm stuck trying to deploy to an Inference Endpoint. To check whether this was an issue with my trained model, I tried deploying the Unsloth Llama model (unsloth/Meta-Llama-3.1-8B-Instruct-bnb-4bit) directly to an Inference Endpoint, and I keep getting the same failure.

Is this a build configuration issue? Or have I done something wrong in the advanced configuration when creating the endpoint?

Thoughts?

(screenshots of the endpoint failure attached)
sourceful-tolu commented 1 month ago

Found the fix!

Known issue: https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct/discussions/34

Nothing to do with Unsloth. Thanks and keep up the good work.
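For anyone else who lands here: the linked discussion concerns the new `rope_scaling` schema that Meta shipped in the Llama 3.1 config, which `transformers` releases before 4.43 reject with a `ValueError` at load time, so upgrading `transformers` on the endpoint resolves it. The snippet below is a hypothetical illustration (the helper name and the assumption that this is the error behind the screenshots are mine, not from the thread); the `rope_scaling` fields are copied from the Llama 3.1 config format.

```python
# Hedged sketch: detect whether a model's config.json uses the Llama 3.1
# rope_scaling schema, which requires a recent transformers (>= 4.43).
# `needs_new_transformers` is a hypothetical helper for illustration only.

def needs_new_transformers(config: dict) -> bool:
    """Return True if the config uses the Llama 3.1 `rope_scaling` schema.

    Older transformers versions expect rope_scaling to be exactly
    {"type": ..., "factor": ...} and raise a ValueError on the new fields.
    """
    rope = config.get("rope_scaling") or {}
    return rope.get("rope_type") == "llama3"


# Trimmed rope_scaling block as it appears in Meta-Llama-3.1 configs.
llama31_cfg = {
    "rope_scaling": {
        "factor": 8.0,
        "low_freq_factor": 1.0,
        "high_freq_factor": 4.0,
        "original_max_position_embeddings": 8192,
        "rope_type": "llama3",
    }
}

print(needs_new_transformers(llama31_cfg))  # True
```

If this returns True for your model and the endpoint image pins an older `transformers`, the load will fail regardless of how the model was trained, which matches the conclusion above that the failure had nothing to do with Unsloth.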

danielhanchen commented 1 month ago

Great you solved it!