NVIDIA / ChatRTX

A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM

Update to Llama 3 8B model #55

Open EwoutH opened 6 months ago

EwoutH commented 6 months ago

It would be great if the Llama 2 13B AWQ 4-bit quantized model currently in use were upgraded to the Llama 3 8B model. It can be quantized in the same way. This would have several advantages:

The models are available at: https://huggingface.co/collections/meta-llama/meta-llama-3-66214712577ca38149ebb2b6

anujj commented 5 months ago

Thanks for the suggestion. We will check internally.

shotelco commented 5 months ago

Any updates on this?

oscarbg commented 4 months ago

+1

3bagorion33 commented 3 weeks ago

+1

luizdequeiroz commented 3 weeks ago

+1