AIAnytime / Zephyr-7B-beta-RAG-Demo

Zephyr 7B beta RAG Demo inside a Gradio app powered by BGE Embeddings, ChromaDB, and Zephyr 7B Beta LLM.
MIT License
35 stars, 22 forks

The model is really slow locally. #2

Open Sabk0926 opened 1 year ago

Sabk0926 commented 1 year ago

I am running the model on an RTX A1000 GPU, but it takes 60 seconds to get an answer.

BakingBrains commented 12 months ago

@Sabk0926 It's running on the CPU. Run it on the GPU by changing the model_kwargs parameter.

Sabk0926 commented 12 months ago

No luck running on the GPU either. I can try running it on Google Colab. Could you give me an example of how to use model_kwargs?
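
For reference, a minimal sketch of what changing model_kwargs could look like. It assumes the parameter in question is the one passed to LangChain's HuggingFaceBgeEmbeddings, which forwards it to the underlying sentence-transformers model. The model name "BAAI/bge-large-en" and the helpers pick_device, build_model_kwargs, and build_embeddings are illustrative and not taken from the repository. This only moves the embedding model to the GPU. If the LLM itself is loaded through a separate library, its GPU setting would have to be changed there as well.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is no way to use the GPU here.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def build_model_kwargs() -> dict:
    """Build the model_kwargs dict with the device chosen at runtime."""
    return {"device": pick_device()}


def build_embeddings(model_name: str = "BAAI/bge-large-en"):
    """Create BGE embeddings on the selected device.

    The model name is an assumption; use whatever the app already loads.
    The import is local so the helpers above work without LangChain.
    """
    from langchain_community.embeddings import HuggingFaceBgeEmbeddings

    return HuggingFaceBgeEmbeddings(
        model_name=model_name,
        model_kwargs=build_model_kwargs(),  # e.g. {"device": "cuda"}
        encode_kwargs={"normalize_embeddings": True},
    )


if __name__ == "__main__":
    # Quick check of which device would be used on this machine.
    print(build_model_kwargs())
```

If this prints {'device': 'cpu'} on a machine with an NVIDIA GPU, the installed PyTorch build probably lacks CUDA support, and changing model_kwargs alone will not help.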