Closed Immortalin closed 11 months ago
Exllama v2 seems to be working now. Would you like to test it out? Simply pass `version=2` to `ExllamaModel` as below:
```python
your_gptq_model = ExllamaModel(
    version=2,
    model_path="TheBloke/MythoMax-L2-13B-GPTQ",  # automatic download
    max_total_tokens=4096,
)
```
Thank you!
https://github.com/turboderp/exllamav2