Add Quantization Code

ypeleg / llama

User-friendly LLaMA: Train or Run the model using PyTorch. Nothing else.

330 stars, 60 forks

Status: Open (opened by htcml 1 year ago)

htcml commented 1 year ago

Are you able to add quantization code so that the model can be run on a smaller GPU?
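
For readers unfamiliar with the request, here is a minimal sketch of what weight quantization means. It is not code from this repository, and the function names `quantize_row` and `dequantize_row` are hypothetical. It assumes the simplest scheme, symmetric int8 quantization with absmax scaling, applied to one row of weights in plain Python. Each weight is stored as an integer in [-127, 127] plus one shared float scale per row, so a weight takes one byte instead of the two bytes of float16 or four of float32.

```python
from typing import List, Tuple


def quantize_row(row: List[float]) -> Tuple[List[int], float]:
    """Symmetric int8 quantization of one weight row (absmax scaling)."""
    absmax = max(abs(w) for w in row)
    # Guard against an all-zero row, which would otherwise give a zero scale.
    scale = absmax / 127.0 if absmax > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    q = [max(-127, min(127, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [v * scale for v in q]


# Illustrative values only, not real model weights.
weights = [0.5, -1.27, 0.0, 0.31]
q, scale = quantize_row(weights)
restored = dequantize_row(q, scale)

# Rounding error per weight is at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

A real implementation for this model would presumably apply the same idea per layer to the PyTorch weight tensors and keep the matrix multiplications on the GPU. That part is not shown here.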