Since the Stability AI folks provide GGUF quantizations, it's easy to integrate by following: https://slack.tabbyml.com/Gd5zV1P69JN/how-can-i-indicate-a-custom-model-to-tabbyml
Can I point to a local GGUF file like this? And what about the "sha256"?
For running Tabby on a downloaded model, you can refer directly to https://github.com/TabbyML/tabby/blob/main/MODEL_SPEC.md
Can the GGUF file be located somewhere other than the Tabby models folder? I want to use the GGUF file LM Studio has already downloaded, to save disk space.
Yes, but you still need to organize it into the directory format specified in the model spec.
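For reference, a minimal sketch of that layout, assuming the structure described in MODEL_SPEC.md: a tabby.json next to a ggml/ directory, with the GGUF named q8_0.v2.gguf regardless of its actual quantization. The target directory here is hypothetical, the source path is the LM Studio file that appears later in this thread, and the symlink in place of a copy is an assumption the spec itself doesn't address:

$ mkdir -p ~/my-models/DeepseekCoder-6.7B/ggml
# write a tabby.json (name, prompt_template, etc.) into ~/my-models/DeepseekCoder-6.7B/
$ ln -s /Users/click/.cache/lm-studio/models/TheBloke/deepseek-coder-6.7B-instruct-GGUF/deepseek-coder-6.7b-instruct.Q8_0.gguf ~/my-models/DeepseekCoder-6.7B/ggml/q8_0.v2.gguf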
Would creating a symbolic link work?
So if I symlink the LM Studio file into the Tabby model directory, it should work?
Apparently not. It even deleted the links to the local files.
thread 'main' panicked at crates/tabby-common/src/registry.rs:87:9:
Invalid model id TabbyML/StableCode-3B/
stack backtrace:
0: 0x105b4d720 -
Hi, could you share the following (e.g. the output of find)?
URL to original file: file:///Users/click/.cache/lm-studio/models/TheBloke/deepseek-coder-6.7B-instruct-GGUF/deepseek-coder-6.7b-instruct.Q8_0.gguf
URL to symbolic link: file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf
RUST_BACKTRACE=full tabby serve --device metal --model TabbyML/DeepseekCoder-6.7B/
For --model, please use the absolute full path as the argument.
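Presumably that means the path to the model directory itself, not a file:// URL and not the .gguf inside it. A hedged example, using the directory from the symlink URL above:

$ tabby serve --device metal --model /Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B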
I'm not able to load the model.
I tried the Q8 quantised files of both https://huggingface.co/brittlewis12/stable-code-3b-GGUF and https://huggingface.co/TheBloke/stable-code-3b-GGUF, but llama.cpp is unable to load either file.
$ TABBY_DISABLE_USAGE_COLLECTION=1 tabby serve --device metal --model /Users/fungiboletus/Desktop/StableCode-3B
2024-01-18T08:07:16.405013Z INFO tabby::services::model: crates/tabby/src/services/model.rs:80: Loading model from local path /Users/fungiboletus/Desktop/StableCode-3B
2024-01-18T08:07:16.405333Z INFO tabby::serve: crates/tabby/src/serve.rs:111: Starting server, this might takes a few minutes...
2024-01-18T08:07:16.420777Z ERROR llama_cpp_bindings: crates/llama-cpp-bindings/src/lib.rs:62: Unable to load model: /Users/fungiboletus/Desktop/StableCode-3B/ggml/q8_0.v2.gguf
$ ls -R /Users/fungiboletus/Desktop/StableCode-3B
ggml tabby.json
./ggml:
q8_0.v2.gguf
I also tried to re-convert the GGUF model using llama.cpp's quantize tool with the COPY type, which rewrites the file in the current GGUF format without requantizing, in case the format had changed, but this wasn't helpful.
$ /Users/fungiboletus/Desktop/llama.cpp/quantize q8_0.v2.old.gguf q8_0.v2.gguf COPY
file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf
How should I structure it otherwise? Please explain in comprehensive steps with reference to the URLs I've provided.
This doesn't work, if that is what you were referring to: "RUST_BACKTRACE=full tabby serve --device metal --model file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf"
thread 'main' panicked at crates/tabby-common/src/registry.rs:87:9:
Invalid model id file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf
stack backtrace:
0: 0x103c39720 -
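For what it's worth, both panics in this thread come from the same model-id parse: a trailing slash (TabbyML/StableCode-3B/) and a file:// URL each fail the registry check. The flag appears to accept either a registry id or a plain absolute path to the model directory, so the registry form would presumably be:

$ tabby serve --device metal --model TabbyML/DeepseekCoder-6.7B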
I get the same as @fungiboletus with my custom registry.
tabby serve --device metal --model anoldguy/StableCode-3B
Writing to new file.
🎯 Downloaded https://huggingface.co/stabilityai/stable-code-3b/resolve/main/stable-code-3b-Q6_K.gguf to /Users/nathan/.tabby/models/anoldguy/StableCode-3B/ggml/q8_0.v2.gguf.tmp
00:00:23 ▕████████████████████▏ 2.14 GiB/2.14 GiB 93.90 MiB/s ETA 0s.
2024-01-18T14:05:47.674093Z INFO tabby::serve: crates/tabby/src/serve.rs:111: Starting server, this might takes a few minutes...
2024-01-18T14:05:47.674908Z INFO tabby::services::code: crates/tabby/src/services/code.rs:53: Index is ready, enabling server...
2024-01-18T14:05:47.772956Z ERROR llama_cpp_bindings: crates/llama-cpp-bindings/src/lib.rs:62: Unable to load model: /Users/nathan/.tabby/models/anoldguy/StableCode-3B/ggml/q8_0.v2.gguf
I'm using this as the definition, but I'm unsure about the prompt template. 🤔
{
  "name": "StableCode-3B",
  "license_name": "STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE",
  "license_url": "https://huggingface.co/stabilityai/stable-code-3b/blob/main/LICENSE",
  "prompt_template": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
  "provider_url": "https://huggingface.co/stabilityai/stable-code-3b",
  "urls": [
    "https://huggingface.co/stabilityai/stable-code-3b/resolve/main/stable-code-3b-Q6_K.gguf"
  ],
  "sha256": "9749daf176491c33a7318660f1637c97674b0070d81740be8763b2811c495bfc"
}
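The prompt template appears consistent with the stable-code-3b model card, which advertises fill-in-the-middle with StarCoder-style <fim_prefix>/<fim_suffix>/<fim_middle> tokens. Assuming the custom-registry convention from the Slack link above, this object would sit inside the models.json array of a GitHub repo named registry-tabby under your account, and be selected with:

$ tabby serve --device metal --model anoldguy/StableCode-3B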
Thanks for doing the experiment. It seems the reason is that StableLM support was added to llama.cpp after our current checkpoint of it.
This should be fixed once #434 is done.
Is there any plan to support Stable Code 3B officially in the future? Or do we still need to use a custom model registry to do the magic?
We've bumped the llama.cpp version, and it has been released in https://github.com/TabbyML/tabby/releases/tag/nightly
Please give it a try to see if it works with StableCode-3B.
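If the nightly includes the newer llama.cpp, the official registry id tried earlier in the thread should work once the trailing slash is dropped; a sketch, assuming the TabbyML registry entry exists:

$ tabby serve --device metal --model TabbyML/StableCode-3B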
Please describe the feature you want
Please add Stability AI's Stable Code 3B: https://huggingface.co/stabilityai/stable-code-3b
Please reply with a 👍 if you want this feature.