rustformers / llm

[Unmaintained, see README] An ecosystem of Rust libraries for working with large language models
https://docs.rs/llm/latest/llm/
Apache License 2.0
6.07k stars, 355 forks

Support for Llama 70-B #409

AmineDiro closed 1 year ago

AmineDiro commented 1 year ago

Hello, this solves #402.

This is a temporary fix to support the Llama-2 70B model. I wanted to open a draft PR to get your feedback on this implementation of support for the `n_gqa` parameter.

Here is the llama-2-70B--chat.ggmlv3.q4_0.bin model loaded on an A100 GPU: [screenshot: Annotation 2023-08-17 203513]
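
For readers unfamiliar with `n_gqa`: the sketch below shows how a grouped-query attention factor is commonly turned into tensor shapes. It assumes `n_gqa` means "query heads per shared key/value head", as in llama.cpp's `-gqa` option. It is not the code from this PR: `AttentionShape` and `attention_shape` are hypothetical names chosen for illustration, not part of the `llm` crate.

```rust
/// Hypothetical container for attention dimensions (illustrative only,
/// not a type from the `llm` crate).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AttentionShape {
    /// Number of query heads.
    pub n_head: usize,
    /// Number of key/value heads shared across groups of query heads.
    pub n_head_kv: usize,
    /// Width of a single attention head.
    pub head_dim: usize,
    /// Output width of the K and V projections.
    pub n_embd_kv: usize,
}

/// Derives attention dimensions from the embedding width, the number of
/// query heads, and the grouping factor `n_gqa`.
///
/// Returns `None` when the values do not divide evenly, since the shapes
/// would then be inconsistent. `n_gqa == 1` is ordinary multi-head attention.
pub fn attention_shape(n_embd: usize, n_head: usize, n_gqa: usize) -> Option<AttentionShape> {
    if n_head == 0 || n_gqa == 0 || n_embd % n_head != 0 || n_head % n_gqa != 0 {
        return None;
    }
    let n_head_kv = n_head / n_gqa; // each K/V head serves `n_gqa` query heads
    let head_dim = n_embd / n_head; // per-head width is unchanged by grouping
    Some(AttentionShape {
        n_head,
        n_head_kv,
        head_dim,
        n_embd_kv: n_head_kv * head_dim,
    })
}
```

With the published Llama-2 70B dimensions (embedding width 8192, 64 query heads) and `n_gqa = 8`, `attention_shape(8192, 64, 8)` yields 8 key/value heads, a head width of 128, and K/V projections of width 1024 instead of 8192. A loader that assumes square K/V weight matrices therefore needs to be told the grouping factor.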

LLukas22 commented 1 year ago

Looks good, some small nitpicks but if the CI passes it should be good to go 👍

AmineDiro commented 1 year ago

@LLukas22 Thanks for the review 👍🏼 !

LLukas22 commented 1 year ago

Thanks for implementing this :D