Closed lcwxz1989 closed 3 months ago
Hi @lcwxz1989, I tried it on my computer and it is working. Let me share my steps for your reference:

- I download the gguf from https://huggingface.co/TheBloke/deepseek-coder-1.3b-base-GGUF/resolve/main/deepseek-coder-1.3b-base.Q8_0.gguf
- Go to .tabby/models/TabbyML and create a new folder DeepseekCoder-1.3B
- Inside TabbyML/DeepseekCoder-1.3B, create a tabby.json with the content {"name":"DeepseekCoder-1.3B","prompt_template":"<|fim▁begin|>{prefix}<|fim▁hole|>{suffix}<|fim▁end|>","urls":["https://huggingface.co/TheBloke/deepseek-coder-1.3b-base-GGUF/resolve/main/deepseek-coder-1.3b-base.Q8_0.gguf"],"sha256":"9fcdcb283ef5b1d80ec7365b307c1ceab0c0f8ea079b49969f7febc06a11bccd"}
- Inside TabbyML/DeepseekCoder-1.3B, create a new folder ggml, put the gguf file into it, and rename it to model.gguf
- In the offline environment, run tabby serve --model DeepseekCoder-1.3B --device cuda, and after a while, I can see it is running
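The steps above can be sketched as a small shell script (the paths follow this thread's layout; the final copy step is shown as a comment because the gguf must already be downloaded on an online machine):

```shell
# Create the layout Tabby expects for a locally-registered model.
MODEL_DIR="$HOME/.tabby/models/TabbyML/DeepseekCoder-1.3B"
mkdir -p "$MODEL_DIR/ggml"

# Write the model registry entry described in the steps above.
cat > "$MODEL_DIR/tabby.json" <<'EOF'
{
  "name": "DeepseekCoder-1.3B",
  "prompt_template": "<|fim▁begin|>{prefix}<|fim▁hole|>{suffix}<|fim▁end|>",
  "urls": ["https://huggingface.co/TheBloke/deepseek-coder-1.3b-base-GGUF/resolve/main/deepseek-coder-1.3b-base.Q8_0.gguf"],
  "sha256": "9fcdcb283ef5b1d80ec7365b307c1ceab0c0f8ea079b49969f7febc06a11bccd"
}
EOF

# Copy the pre-downloaded gguf into place (done separately, on the online machine):
# cp /path/to/deepseek-coder-1.3b-base.Q8_0.gguf "$MODEL_DIR/ggml/model.gguf"
```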
I think I have already done all of these things. I run Tabby as a Docker image because of the offline env; how do you run Tabby? I think the Docker image may be missing some ENV or something. My image is tabbyml/tabby:0.12.1-rc.0, and the command is:
sudo docker run --runtime=nvidia -it --gpus all -p 8888:8080 -v $HOME/.tabby:/data tabbyml/tabby:0.12.1-rc.0 serve --model StarCoder-1B --device cuda
The $HOME/.tabby dir tree is in the picture above.
I used Docker as well; the command I used is:
docker run -it --gpus all \
  -p 8080:8080 -v $HOME/.tabby:/data \
  tabbyml/tabby serve --model DeepseekCoder-1.3B --device cuda
I'm running from Windows WSL2 (Ubuntu 20).
A very important condition is being offline. I want to know if it works well when you are offline and put the pre-downloaded model into the .tabby dir. What does your $HOME/.tabby dir tree look like? I really do not know where it goes wrong.
Yeah, I start the server without the internet. My computer is using a 3060 Ti, but I suppose that has nothing to do with the GPU. Not sure if you need to double-check the content of tabby.json?
My tabby.json is like this, and I do not know where it is wrong. In our project there is no doc describing how to install offline, nor does the command support a verbose option, so I do not know how to diagnose what is wrong.
The tabby.json looks correct. If you are using Docker, you might want to try this method: https://tabby.tabbyml.com/blog/2024/03/25/deploy-tabby-in-air-gapped-environment-with-docker/
Yes, but I do not have an online Ubuntu machine, so I downloaded Tabby on macOS, ran Tabby to download the model.gguf, then transferred the model.gguf to the /data/models/ dir on the offline Ubuntu machine.
Not sure if it's due to the different environment. Have you tried the Docker method described in the post?
I do not have an Ubuntu env, and I think the post only downloads the model.gguf into the dir anyway, so I downloaded the model.gguf on macOS and copied it into the dir used by Docker.
I do not know what the dir tree inside the Docker container looks like. Can you give me the path where the gguf should be located?
You can find them here in the Docker container.
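Since the docker commands above mount $HOME/.tabby at /data, the in-container location of the gguf can be inferred (this path mapping is deduced from the -v flag used in this thread, not taken from official docs):

```shell
# Host side:      $HOME/.tabby/models/TabbyML/DeepseekCoder-1.3B/ggml/model.gguf
# Container side: the same file, reached through the /data bind mount.
HOST_ROOT="$HOME/.tabby"
CONTAINER_ROOT="/data"
HOST_PATH="$HOST_ROOT/models/TabbyML/DeepseekCoder-1.3B/ggml/model.gguf"
# Rewrite the host prefix into the container prefix:
echo "$CONTAINER_ROOT${HOST_PATH#"$HOST_ROOT"}"
```

This prints /data/models/TabbyML/DeepseekCoder-1.3B/ggml/model.gguf, which is where the server running inside the container would look for the model.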
Closing stale issue
Describe the problem:
- metal: system
- OS: Ubuntu 20.04
- GPU: NVIDIA GeForce RTX 2060
- CUDA: 12.3
- CUDA driver version: 545.23.08
- env: offline, no internet
- Docker env: put models into /models

To Reproduce:
- CPU percent: 100%