Could you post the actual text of the command to run inference with quantization?
I cannot see the images because I'm blind and use a screen reader.
The README says "With quantization, you can run LLaMA with a 4GB memory GPU." and then shows two pictures.
Thanks!