Closed: naveenmaan closed this issue 1 week ago.
Have you tried using Ollama? It should simplify things considerably for running the model on CPU.
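For what it's worth, here is a minimal sketch of what that could look like with the official `ollama` Python client (`pip install ollama`). It assumes the Ollama daemon is running locally and that `llama3.2:1b` is the registry tag for the 1B model; check `ollama list` or the Ollama library page for the exact name.

```python
# Minimal sketch using the official `ollama` Python client.
# Assumes the Ollama daemon is running and the model has been pulled,
# e.g. with `ollama pull llama3.2:1b` (tag is an assumption).
import ollama

response = ollama.chat(
    model="llama3.2:1b",  # assumed registry tag for the 1B model
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)
print(response["message"]["content"])
```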
I want a direct approach instead of a third-party library.
This is not the intended use of this repository. The goal of llama-models is to show the architecture in its simplest form. Inference solutions can get extremely complicated because of the multitude of environments they need to support. There are various high-quality implementations, and in our opinion Ollama is one of the best for running on CPU.
Hello, after downloading the Llama 3.2 1B model, I am trying to run it on CPU, but I keep getting errors saying that no GPU is available or that CUDA is missing. I tried changing the code, but I am still facing the issue.
My task is to run the 1B model on my CPU.
Can anyone help me?
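If you want to stay closer to a "direct" route than Ollama, one common option is to load the checkpoint with Hugging Face transformers and keep everything on CPU so no CUDA calls are ever made. This is a hedged sketch, not the llama-models reference stack: the Hub ID `meta-llama/Llama-3.2-1B` is an assumption based on Meta's naming on the Hub, and the checkpoint is gated, so you need an accepted license and a Hugging Face token.

```python
# CPU-only sketch using Hugging Face transformers (not the llama-models
# reference implementation). Assumes `transformers` and `torch` are
# installed and "meta-llama/Llama-3.2-1B" is the Hub ID (an assumption).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-3.2-1B"  # assumed Hub ID for the 1B model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float32,  # fp32 is the safest choice on CPUs without fast bf16
)
model.to("cpu")  # explicit, though CPU is the default placement without a GPU

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Expect generation to be slow in fp32 on a laptop CPU; that is largely why the maintainers point to Ollama (which uses llama.cpp-style quantized inference) for this use case.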