Open AndreaLombax opened 10 months ago
I ran into this as well. I can reproduce it on Windows with a GTX 1660 Ti and a full Python 3.11 install from python.org by running the following:
# Create a clean venv
pip install ctransformers
pip install ctransformers[cuda]
The second install pulls in the NVIDIA dependencies, but when I run a very similar code snippet the model never actually loads into GPU memory according to Task Manager.
It does not appear to be tied to the model or to how the model is installed. I was scratching my head over the same model as the OP pulled manually, the Dolphin Mistral 2.1 GGUF model pulled manually, and several variations of Llama 2 pulled automatically using the Hugging Face pattern the author references in the README.md.
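For anyone who wants a number rather than eyeballing Task Manager, here is a rough sketch that reads per-GPU memory from `nvidia-smi`. It assumes `nvidia-smi` is on PATH and reports plain integers (some setups print `[N/A]`, which this does not handle). The helper names `parse_gpu_memory` and `query_gpu_memory` are made up for this example.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' rows (values in MiB) from nvidia-smi CSV output."""
    rows = []
    for line in csv_text.strip().splitlines():
        if not line.strip():
            continue  # skip blank lines
        index, used, total = (field.strip() for field in line.split(","))
        rows.append(
            {"index": int(index), "used_mib": int(used), "total_mib": int(total)}
        )
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return memory usage for every visible GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=index,memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_memory(result.stdout)


# Usage idea: call query_gpu_memory() once before loading the model and once
# after, then compare used_mib. If it barely moves, nothing was offloaded.
```

Comparing the before and after readings should make it obvious whether any layers ended up on the GPU.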
Hi, I'm having trouble with Mistral: the model is not loading onto the GPU and only runs on the CPU.
That's the code:
Versions:
I have two NVIDIA A16 16GB GPUs, and only 4 MB is loaded on each.
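One thing that may be worth ruling out, since the snippet isn't shown above: the ctransformers README says layers are only run on the GPU when `gpu_layers` is set in `from_pretrained`, and its documented default is 0. Below is a minimal sketch of that, assuming the CUDA build is installed. The wrapper name `load_with_gpu_offload` and the model path in the usage comment are placeholders, and 50 is just the value used in the README example, not a tuned number.

```python
def load_with_gpu_offload(model_path_or_repo, gpu_layers=50, **kwargs):
    """Load a model with ctransformers, asking it to run `gpu_layers` layers on the GPU."""
    # Imported here so the helper can be defined even where ctransformers
    # is not installed; the import only happens when the function is called.
    from ctransformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        model_path_or_repo,
        gpu_layers=gpu_layers,  # documented default is 0, i.e. CPU only
        **kwargs,
    )


# Usage (placeholder path, adjust to your own model file or repo id):
# llm = load_with_gpu_offload("path/to/model.gguf")
# print(llm("Hello"))
```

If GPU memory still stays near zero with `gpu_layers` set, that would point at the install rather than the call.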