Closed Iory1998 closed 2 months ago
I've got a similar error with a different model: https://huggingface.co/YorkieOH10/granite-8b-code-instruct-Q8_0-GGUF
```json
{
  "title": "Failed to load model",
  "cause": "llama.cpp error: 'error loading model vocabulary: unknown pre-tokenizer type: 'refact''",
  "errorData": {
    "n_ctx": 1500,
    "n_batch": 512,
    "n_gpu_layers": 37
  },
  "data": {
    "memory": {
      "ram_capacity": "31.70 GB",
      "ram_unused": "9.69 GB"
    },
    "gpu": {
      "type": "NvidiaCuda",
      "vram_recommended_capacity": "16.00 GB",
      "vram_unused": "14.87 GB"
    },
    "os": {
      "platform": "win32",
      "version": "10.0.22631",
      "supports_avx2": true
    },
    "app": {
      "version": "0.2.22",
      "downloadsDir": "C:\\Users\\acc4k\\.cache\\lm-studio\\models"
    },
    "model": {}
  }
}
```
Pending support from llama.cpp; ongoing discussion here:
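While waiting for support to land, you can at least confirm which pre-tokenizer a GGUF file declares (the `tokenizer.ggml.pre` metadata key) without trying to load it. Below is a minimal sketch of a header-only metadata reader, assuming a well-formed GGUF file; the function and helper names are my own, not part of any library:

```python
import struct

GGUF_MAGIC = b"GGUF"
# Fixed byte sizes of GGUF metadata value types (from the GGUF spec):
# uint8, int8, uint16, int16, uint32, int32, float32, bool, uint64, int64, float64
FIXED_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
T_STRING, T_ARRAY = 8, 9

def _read_str(buf, off):
    # GGUF string: uint64 length followed by UTF-8 bytes
    (n,) = struct.unpack_from("<Q", buf, off)
    off += 8
    return buf[off:off + n].decode("utf-8"), off + n

def _skip_value(buf, off, vtype):
    # Advance past one metadata value of the given type
    if vtype in FIXED_SIZES:
        return off + FIXED_SIZES[vtype]
    if vtype == T_STRING:
        _, off = _read_str(buf, off)
        return off
    if vtype == T_ARRAY:
        etype, n = struct.unpack_from("<IQ", buf, off)
        off += 12
        for _ in range(n):
            off = _skip_value(buf, off, etype)
        return off
    raise ValueError(f"unknown GGUF value type {vtype}")

def pre_tokenizer(path_or_bytes):
    """Return the declared `tokenizer.ggml.pre` string, or None if absent."""
    buf = path_or_bytes if isinstance(path_or_bytes, (bytes, bytearray)) \
        else open(path_or_bytes, "rb").read()
    if buf[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # Header: magic, uint32 version, uint64 tensor count, uint64 metadata KV count
    version, _n_tensors, n_kv = struct.unpack_from("<IQQ", buf, 4)
    off = 24
    for _ in range(n_kv):
        key, off = _read_str(buf, off)
        (vtype,) = struct.unpack_from("<I", buf, off)
        off += 4
        if key == "tokenizer.ggml.pre" and vtype == T_STRING:
            value, _ = _read_str(buf, off)
            return value
        off = _skip_value(buf, off, vtype)
    return None
```

If this prints a value like `refact` or `command-r` that your llama.cpp build predates, the fix is to wait for (or update to) a build that knows that pre-tokenizer; re-quantizing with an older converter does not help.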
Hi, I hope this message finds you well.
I recently ran into a persistent problem: each time I try to run a Command-R based model, I get the error message below and the model just doesn't load. Loading the same model in the Oobabooga web UI works fine.
The complete error message is as follows:
"llama.cpp error: 'error loading model vocabulary: unknown pre-tokenizer type: 'command-r''"

Diagnostics info:

```json
{
  "memory": {
    "ram_capacity": "31.75 GB",
    "ram_unused": "22.74 GB"
  },
  "gpu": {
    "type": "NvidiaCuda",
    "vram_recommended_capacity": "24.00 GB",
    "vram_unused": "22.76 GB"
  },
  "os": {
    "platform": "win32",
    "version": "10.0.22631",
    "supports_avx2": true
  },
  "app": {
    "version": "0.2.22",
    "downloadsDir": "D:\\LM Studio\\models"
  },
  "model": {}
}
```
Models I tried:
- https://huggingface.co/bartowski/35b-beta-long-GGUF/blob/main/35b-beta-long-Q4_K_M.gguf
- https://huggingface.co/MarsupialAI/Coomand-R-35B-v1_iMatrix_GGUF/blob/main/Coomand-R-35B-v1_iQ3m.gguf
- https://huggingface.co/TheDrummer/Coomand-R-35B-v1-GGUF/blob/main/Coomand-R-35B-v1-Q3_K_M.gguf
Each time I try to load these models, I get the same error.
Could you please shed some light on the issue and provide a fix?
Thank you in advance :)