rajatkrishna / chat-llama3

Local Llama 3 Inference with OpenVINO
MIT License

Llama 3.1 support #1

Open taxmeifyoucan opened 1 month ago

taxmeifyoucan commented 1 month ago

I am having trouble running the latest Llama 3.1 on OpenVINO. I am trying to use optimum-intel to convert the new model, but it always fails with an error. It would be great to have 3.1 already quantized too, so it could just be pulled from Hugging Face. I am not sure whether the tool needs more work to support the latest version of the model.

rajatkrishna commented 1 week ago

Can you try upgrading the transformers library to the latest version before quantizing with optimum-intel? Here is the quantized Llama 3.1 model. Let me know if you're facing any other issues.
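For reference, a minimal conversion recipe along the lines suggested above, using the `optimum-cli export openvino` command that ships with optimum-intel's OpenVINO extras (the output directory name is a placeholder, and the gated meta-llama model ID assumes you have been granted access on Hugging Face):

```shell
# Upgrade transformers and install optimum-intel with OpenVINO support
pip install --upgrade transformers "optimum[openvino]"

# Export Llama 3.1 to OpenVINO IR with 4-bit weight compression
# (downloads the model; requires access to the gated meta-llama repo)
optimum-cli export openvino \
  --model meta-llama/Meta-Llama-3.1-8B-Instruct \
  --weight-format int4 \
  llama-3.1-8b-instruct-ov
```

The exported directory can then be loaded with `OVModelForCausalLM.from_pretrained` from optimum-intel.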

taxmeifyoucan commented 1 week ago

Thanks, I already got it working and reported some issues to the optimum project. But it would still be nice to update this repo to the latest version, so I will keep this open.