TabbyML / tabby

Self-hosted AI coding assistant
https://tabbyml.com

Stability AI's Stable Code 3B support #1230

Closed: clickclack777 closed this issue 10 months ago

clickclack777 commented 10 months ago

Please describe the feature you want
Please add Stability AI's Stable Code 3B: https://huggingface.co/stabilityai/stable-code-3b

Please reply with a 👍 if you want this feature.

wsxiaoys commented 10 months ago

Since the Stability AI folks provide GGUF quantizations, it's easy to integrate by following https://slack.tabbyml.com/Gd5zV1P69JN/how-can-i-indicate-a-custom-model-to-tabbyml
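
A minimal sketch of such an entry in a custom registry's models.json (the url points at an upstream GGUF build; the sha256 is a placeholder you'd compute from the downloaded file):

```json
[
  {
    "name": "StableCode-3B",
    "prompt_template": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
    "urls": [
      "https://huggingface.co/stabilityai/stable-code-3b/resolve/main/stable-code-3b-Q6_K.gguf"
    ],
    "sha256": "<sha256 of the downloaded gguf>"
  }
]
```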

clickclack777 commented 10 months ago

Can I point to a local GGUF file like this? What about the "sha256"?

[screenshot: Screenshot 2024-01-17 at 23 02 17]

wsxiaoys commented 10 months ago

To run Tabby on an already-downloaded model, you can refer directly to https://github.com/TabbyML/tabby/blob/main/MODEL_SPEC.md
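
In short, the spec expects a directory shaped like this (a sketch; the top-level directory name is up to you):

```
StableCode-3B/
├── tabby.json
└── ggml/
    └── q8_0.v2.gguf
```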

clickclack777 commented 10 months ago

Can the GGUF file be located somewhere other than the Tabby model folder? I want to reuse the GGUF file LM Studio has already downloaded, to save disk space.

wsxiaoys commented 10 months ago

Yes, but you still need to organize it into the directory format specified in the model spec.

> Want to use the gguf file LM Studio has already downloaded to save disk space.

Creating a symbolic link should work.
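
For example, something like this (a sketch; the LM Studio source path is illustrative):

```sh
# Create the spec-shaped directory under Tabby's models dir
mkdir -p ~/.tabby/models/TabbyML/StableCode-3B/ggml

# Link the GGUF LM Studio already downloaded instead of copying it
ln -s ~/.cache/lm-studio/models/<publisher>/<repo>/stable-code-3b.Q8_0.gguf \
      ~/.tabby/models/TabbyML/StableCode-3B/ggml/q8_0.v2.gguf
```

You'd still need a tabby.json next to ggml/, as described in the spec.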

clickclack777 commented 10 months ago

So if I:

  1. Set up a folder called "StableCode-3B" with a sub-folder "ggml" and place the symbolic link there.
  2. Copy another "tabby.json" file into the "StableCode-3B" folder and modify the model name. Should the url point to the symbolic file? What about the "sha256" value?
  3. Point the model.json url at the symbolic link as well.

would it work?

clickclack777 commented 10 months ago

Apparently not. It deletes the model rows pointing to the local files.

```
thread 'main' panicked at crates/tabby-common/src/registry.rs:87:9:
Invalid model id TabbyML/StableCode-3B/
stack backtrace:
   0: 0x105b4d720 - ::fmt::h6d4268b2ed62fb94
   1: 0x105b7115c - core::fmt::write::h5d55d44549819258
   2: 0x105b49f50 - std::io::Write::write_fmt::hc515897f91abd6cf
   3: 0x105b4d560 - std::sys_common::backtrace::print::h2c300c1ebedfc73c
   4: 0x105b4ee4c - std::panicking::default_hook::{{closure}}::h0aa9be5c44269370
   5: 0x105b4eb78 - std::panicking::default_hook::h2c0ef097934ee9e6
   6: 0x105b4f394 - std::panicking::rust_panic_with_hook::h84c8637cb6e56008
   7: 0x105b4f2a0 - std::panicking::begin_panic_handler::{{closure}}::h25482adda06c7b7f
   8: 0x105b4dbac - std::sys_common::backtrace::__rust_end_short_backtrace::h0c6f3beb22324a29
   9: 0x105b4f004 - _rust_begin_unwind
  10: 0x105c5d0f4 - core::panicking::panic_fmt::h9072a0246ecafd14
  11: 0x10554b800 - tabby_common::registry::parse_model_id::h36a479eafd05fc23
  12: 0x104b80154 - tabby_download::download_model::{{closure}}::h9eebbf65aa472130
  13: 0x104b9fc14 - tabby::services::model::download_model_if_needed::{{closure}}::hbeb5a21d1fa4180a
  14: 0x104ba0128 - tabby::serve::main::{{closure}}::h7cb653b915a7cc63
  15: 0x104b95f68 - tokio::runtime::runtime::Runtime::block_on::h63288b1efbfe5842
  16: 0x104ca2ffc - tabby::main::ha5dc08e503a2bd27
  17: 0x104b89c38 - std::sys_common::backtrace::__rust_begin_short_backtrace::h0b5db4848f3c85bc
  18: 0x104d520b4 - std::rt::lang_start::{{closure}}::h42a0649d95a186a0
  19: 0x105b42a54 - std::rt::lang_start_internal::hadaf077a6dd0140b
  20: 0x104ca30fc - _main
```

wsxiaoys commented 10 months ago

Hi, could you share:

  1. the structure of your local model dir (maybe the output of find)
  2. the command you used to invoke tabby

clickclack777 commented 10 months ago

URL to the original file: file:///Users/click/.cache/lm-studio/models/TheBloke/deepseek-coder-6.7B-instruct-GGUF/deepseek-coder-6.7b-instruct.Q8_0.gguf

URL to the symbolic link: file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf

RUST_BACKTRACE=full tabby serve --device metal --model TabbyML/DeepseekCoder-6.7B/

wsxiaoys commented 10 months ago

  1. It seems your directory structure doesn't follow https://github.com/TabbyML/tabby/blob/main/MODEL_SPEC.md
  2. When passing a local directory to --model, please use the absolute path as the argument (see the example below)
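
For example (a sketch; substitute your own directory):

```sh
tabby serve --device metal --model /Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B
```
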
fungiboletus commented 10 months ago

I'm not able to load the model.

I tried the q8-quantised files from both https://huggingface.co/brittlewis12/stable-code-3b-GGUF and https://huggingface.co/TheBloke/stable-code-3b-GGUF, but llama.cpp is unable to load them.

$ TABBY_DISABLE_USAGE_COLLECTION=1 tabby serve --device metal --model /Users/fungiboletus/Desktop/StableCode-3B
2024-01-18T08:07:16.405013Z  INFO tabby::services::model: crates/tabby/src/services/model.rs:80: Loading model from local path /Users/fungiboletus/Desktop/StableCode-3B
2024-01-18T08:07:16.405333Z  INFO tabby::serve: crates/tabby/src/serve.rs:111: Starting server, this might takes a few minutes...
2024-01-18T08:07:16.420777Z ERROR llama_cpp_bindings: crates/llama-cpp-bindings/src/lib.rs:62: Unable to load model: /Users/fungiboletus/Desktop/StableCode-3B/ggml/q8_0.v2.gguf

$ ls -R /Users/fungiboletus/Desktop/StableCode-3B   
ggml        tabby.json

./ggml:
q8_0.v2.gguf

I also tried re-converting the GGUF model with llama.cpp, in case the format had changed, but that didn't help.

$ /Users/fungiboletus/Desktop/llama.cpp/quantize q8_0.v2.old.gguf q8_0.v2.gguf COPY

clickclack777 commented 10 months ago

file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf

  1. How should I structure it otherwise? Please explain in comprehensive steps, with reference to the URLs I've provided.

  2. This doesn't work, if that is what you were referring to: "RUST_BACKTRACE=full tabby serve --device metal --model file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf"

```
thread 'main' panicked at crates/tabby-common/src/registry.rs:87:9:
Invalid model id file:///Users/click/.tabby/models/TabbyML/DeepseekCoder-6.7B/ggml/deepseek-coder-6.7b-instruct.Q8_0.gguf
stack backtrace:
   0: 0x103c39720 - ::fmt::h6d4268b2ed62fb94
   1: 0x103c5d15c - core::fmt::write::h5d55d44549819258
   2: 0x103c35f50 - std::io::Write::write_fmt::hc515897f91abd6cf
   3: 0x103c39560 - std::sys_common::backtrace::print::h2c300c1ebedfc73c
   4: 0x103c3ae4c - std::panicking::default_hook::{{closure}}::h0aa9be5c44269370
   5: 0x103c3ab78 - std::panicking::default_hook::h2c0ef097934ee9e6
   6: 0x103c3b394 - std::panicking::rust_panic_with_hook::h84c8637cb6e56008
   7: 0x103c3b2a0 - std::panicking::begin_panic_handler::{{closure}}::h25482adda06c7b7f
   8: 0x103c39bac - std::sys_common::backtrace::__rust_end_short_backtrace::h0c6f3beb22324a29
   9: 0x103c3b004 - _rust_begin_unwind
  10: 0x103d490f4 - core::panicking::panic_fmt::h9072a0246ecafd14
  11: 0x103637800 - tabby_common::registry::parse_model_id::h36a479eafd05fc23
  12: 0x102c6c154 - tabby_download::download_model::{{closure}}::h9eebbf65aa472130
  13: 0x102c8bc14 - tabby::services::model::download_model_if_needed::{{closure}}::hbeb5a21d1fa4180a
  14: 0x102c8c128 - tabby::serve::main::{{closure}}::h7cb653b915a7cc63
  15: 0x102c81f68 - tokio::runtime::runtime::Runtime::block_on::h63288b1efbfe5842
  16: 0x102d8effc - tabby::main::ha5dc08e503a2bd27
  17: 0x102c75c38 - std::sys_common::backtrace::__rust_begin_short_backtrace::h0b5db4848f3c85bc
  18: 0x102e3e0b4 - std::rt::lang_start::{{closure}}::h42a0649d95a186a0
  19: 0x103c2ea54 - std::rt::lang_start_internal::hadaf077a6dd0140b
  20: 0x102d8f0fc - _main
```

anoldguy commented 10 months ago

I get the same as @fungiboletus with my custom registry.

tabby serve --device metal --model anoldguy/StableCode-3B
Writing to new file.
🎯 Downloaded https://huggingface.co/stabilityai/stable-code-3b/resolve/main/stable-code-3b-Q6_K.gguf to /Users/nathan/.tabby/models/anoldguy/StableCode-3B/ggml/q8_0.v2.gguf.tmp
   00:00:23 ▕████████████████████▏ 2.14 GiB/2.14 GiB  93.90 MiB/s  ETA 0s.
2024-01-18T14:05:47.674093Z  INFO tabby::serve: crates/tabby/src/serve.rs:111: Starting server, this might takes a few minutes...
2024-01-18T14:05:47.674908Z  INFO tabby::services::code: crates/tabby/src/services/code.rs:53: Index is ready, enabling server...    
2024-01-18T14:05:47.772956Z ERROR llama_cpp_bindings: crates/llama-cpp-bindings/src/lib.rs:62: Unable to load model: /Users/nathan/.tabby/models/anoldguy/StableCode-3B/ggml/q8_0.v2.gguf

I'm using this as the definition, but I'm unsure about the prompt template. 🤔

```json
{
  "name": "StableCode-3B",
  "license_name": "STABILITY AI NON-COMMERCIAL RESEARCH COMMUNITY LICENSE",
  "license_url": "https://huggingface.co/stabilityai/stable-code-3b/blob/main/LICENSE",
  "prompt_template": "<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>",
  "provider_url": "https://huggingface.co/stabilityai/stable-code-3b",
  "urls": [
    "https://huggingface.co/stabilityai/stable-code-3b/resolve/main/stable-code-3b-Q6_K.gguf"
  ],
  "sha256": "9749daf176491c33a7318660f1637c97674b0070d81740be8763b2811c495bfc"
}
```

wsxiaoys commented 10 months ago

Thanks for doing the experiment. It seems the reason is that StableLM support was added to llama.cpp after our current checkpoint.

This should be fixed once #434 is done.

SachiaLanlus commented 10 months ago

> Thanks for doing the experiment. It seems the reason is that StableLM support was added to llama.cpp after our current checkpoint.
>
> This should be fixed once #434 is done.

Is there any plan to support Stable Code 3B officially in the future, or will we still need a custom model registry to do the magic?

wsxiaoys commented 10 months ago

We've bumped the llama.cpp version, and the change has been released in https://github.com/TabbyML/tabby/releases/tag/nightly

Please give it a try and see whether it works with StableCode-3B.
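
For example, reusing the custom registry from the earlier comment:

```sh
tabby serve --device metal --model anoldguy/StableCode-3B
```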

HFrost0 commented 10 months ago

I tried the nightly and it works for me; see my registry.

wsxiaoys commented 10 months ago

Fixed in v0.8.0