Open chukfinley opened 3 weeks ago
Are there plans for a smaller model? If the voice model really is 32.25 GB, it can't be run on a consumer GPU. Also, how many parameters does the model have, and how do I run it?
Also, is the 32.25 GB model just the voice model, or does it include Llama 3 as well?
The 32 GB includes the Llama 3 fine-tune. With the quantization described in #8, you should be able to run it on a consumer GPU.
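As a rough back-of-the-envelope check on why quantization helps, the sketch below scales the 32.25 GB figure by the ratio of bit widths. It assumes the checkpoint stores weights at 16 bits each and that every weight is quantized uniformly; neither is confirmed in this thread, and the scheme in #8 may differ. The helper `quantized_size_gb` is a hypothetical name used only for illustration, and the estimate ignores activations, the KV cache, and any quantization metadata.

```python
def quantized_size_gb(checkpoint_gb: float, stored_bits: int, target_bits: int) -> float:
    """Estimate weight size after quantization by scaling with the bit-width ratio."""
    return checkpoint_gb * target_bits / stored_bits


CHECKPOINT_GB = 32.25  # size quoted in this thread
STORED_BITS = 16       # ASSUMPTION: weights are stored as 16-bit values

for bits in (16, 8, 4):
    size = quantized_size_gb(CHECKPOINT_GB, STORED_BITS, bits)
    print(f"{bits}-bit weights: ~{size:.2f} GB")
```

Under those assumptions, 4-bit weights would take roughly a quarter of the original size, which is in the range of consumer GPU memory. Actual usage will be higher once runtime overhead is included.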