juncongmoo / pyllama

LLaMA: Open and Efficient Foundation Language Models
GNU General Public License v3.0

12GB card #109

arthurwolf opened this issue 1 year ago

arthurwolf commented 1 year ago

My card has 12GB of VRAM, which isn't a case covered anywhere I could see. Would this allow me to do more (run the larger models, etc.)? Any chance of getting instructions for larger cards?

Thanks!

miko8422 commented 8 months ago

You can run the quantized 7B model on your PC, but not the full (non-quantized) 7B version: it will eat more than 12GB of VRAM. Just use a cloud server, or stick with the quantized version if you want to explore prompt engineering.
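For context, a quick back-of-envelope check (plain parameter-count arithmetic, nothing pyllama-specific) shows why the full model can't fit: LLaMA-7B has roughly 6.7B parameters, so the fp16 weights alone overflow a 12 GB card before activations or the KV cache are even counted.

```python
# Rough VRAM needed just to hold LLaMA-7B's weights at various precisions.
# Activations, the KV cache, and CUDA context overhead come on top of this.

N_PARAMS = 6.7e9  # LLaMA-7B parameter count (~6.7 billion)

def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """GiB required to store n_params weights at the given precision."""
    return n_params * bits_per_param / 8 / 2**30

for bits, label in [(16, "fp16 (full)"), (8, "int8"), (4, "int4 (GPTQ)")]:
    print(f"{label:>12}: {weight_memory_gib(N_PARAMS, bits):5.1f} GiB")

# fp16 (full):  12.5 GiB  -> already over a 12 GB card before any activations
#        int8:   6.2 GiB
# int4 (GPTQ):   3.1 GiB  -> plenty of headroom on 12 GB
```

So a 4-bit quantized 7B fits a 12 GB card with room to spare, while the fp16 original does not.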

miko8422 commented 8 months ago

I also have 12GB of VRAM on my own PC, but I wasn't able to run the official 7B model. By the way, I'm using a 4070... I'm starting to regret not buying a 4090, because I want to do some fine-tuning on this model.
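For anyone hitting the same wall: the quantized route described above is the GPTQ-based workflow from the pyllama README. A minimal sketch, recalled from the README rather than verified, so treat the `llama.llama_quant` entry point, the `decapoda-research/llama-7b-hf` checkpoint name, and the flags as assumptions and check the current docs:

```bash
# Quantize the HF-format 7B checkpoint to 4-bit GPTQ weights
# (entry point and flags as remembered from the README; versions may differ)
python -m llama.llama_quant decapoda-research/llama-7b-hf c4 --wbits 4 --save pyllama-7B4b.pt
```

The resulting ~3 GB of 4-bit weights is what makes 7B workable on a 12 GB card.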