TabbyML / tabby

Self-hosted AI coding assistant
https://tabby.tabbyml.com/

The server doesn't keep itself alive on my Windows 10 docker run #1170

Open Mayorc1978 opened 8 months ago

Mayorc1978 commented 8 months ago

Describe the bug
I installed the container using the default command. At the end of the various downloads it says it's starting the server, but then it immediately shuts down.

Information about your version
Please provide output of tabby --version

Information about your GPU

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 536.23                 Driver Version: 536.23       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf          Pwr:Usage/Cap  |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060      WDDM  | 00000000:01:00.0  On |                  N/A |
|  0%   44C    P5             25W / 170W  |   4300MiB / 12288MiB |     23%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                             |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      5372    C+G   ...crosoft\Edge\Application\msedge.exe      N/A    |
|    0   N/A  N/A      6728    C+G   C:\Windows\explorer.exe                     N/A    |
|    0   N/A  N/A      9588    C+G   ...oogle\Chrome\Application\chrome.exe      N/A    |
|    0   N/A  N/A     11084    C+G   ...wekyb3d8bbwe\XboxGameBarWidgets.exe      N/A    |
|    0   N/A  N/A     11860    C+G   ...Data\Local\Programs\Tabby\Tabby.exe      N/A    |
|    0   N/A  N/A     11892    C+G   ...\Docker\frontend\Docker Desktop.exe      N/A    |
|    0   N/A  N/A     12156    C+G   ...12.0_x64__8wekyb3d8bbwe\GameBar.exe      N/A    |
|    0   N/A  N/A     12888    C+G   ...FancyZones\PowerToys.FancyZones.exe      N/A    |
|    0   N/A  N/A     13160    C+G   ...auncher\PowerToys.PowerLauncher.exe      N/A    |
|    0   N/A  N/A     14040    C+G   ...aming\Telegram Desktop\Telegram.exe      N/A    |
|    0   N/A  N/A     14108    C+G   ...GeForce Experience\NVIDIA Share.exe      N/A    |
|    0   N/A  N/A     14476    C+G   ...5n1h2txyewy\ShellExperienceHost.exe      N/A    |
|    0   N/A  N/A     16424    C+G   ...ekyb3d8bbwe\PhoneExperienceHost.exe      N/A    |
|    0   N/A  N/A     16768    C+G   ...CBS_cw5n1h2txyewy\TextInputHost.exe      N/A    |
|    0   N/A  N/A     17280    C+G   ...cal\Microsoft\OneDrive\OneDrive.exe      N/A    |
|    0   N/A  N/A     17324    C+G   ....Search_cw5n1h2txyewy\SearchApp.exe      N/A    |
|    0   N/A  N/A     19444    C+G   ...mTray\FluentTerminal.SystemTray.exe      N/A    |
|    0   N/A  N/A     23264    C+G   ...Programs\Microsoft VS Code\Code.exe      N/A    |
|    0   N/A  N/A     23724    C+G   ...2.0_x64__cv1g1gvanyjgm\WhatsApp.exe      N/A    |
+---------------------------------------------------------------------------------------+

Additional context
Command tested + output:

docker run -it --gpus all -p 8080:8080 -v $HOME/.tabby:/data tabbyml/tabby serve --model TabbyML/DeepseekCoder-6.7B --device cuda

Unable to find image 'tabbyml/tabby:latest' locally
latest: Pulling from tabbyml/tabby
aece8493d397: Pull complete
976b13f9ef1a: Pull complete
1049ed1563c4: Pull complete
74bce33f9447: Pull complete
1380d1556f92: Pull complete
Digest: sha256:9de9bdc8573048d23e9458bd8448eda08409f0b7bd0888aed7606e8ea016089e
Status: Downloaded newer image for tabbyml/tabby:latest
Writing to new file.
🎯 Transferring https://huggingface.co/TheBloke/deepseek-coder-6.7B-base-GGUF/resolve/main/deepseek-coder-6.7b-base.Q8_0.gguf ⠈ 00:27:02 ▕███████████████████ ▏ 6.36 GiB/6.67 GiB 4.14 MiB/s ETA 76s.
🎯 Downloaded https://huggingface.co/TheBloke/deepseek-coder-6.7B-base-GGUF/resolve/main/deepseek-coder-6.7b-base.Q8_0.gguf to /data/models/TabbyML/DeepseekCoder-6.7B/ggml/q8_0.v2.gguf.tmp 00:28:21 ▕████████████████████▏ 6.67 GiB/6.67 GiB 4.02 MiB/s ETA 0s.
2024-01-07T19:30:56.587509Z INFO tabby::serve: crates/tabby/src/serve.rs:111: Starting server, this might takes a few minutes...

It then drops back to the PowerShell prompt, and the container isn't running anymore. Tested with StarCoder 1B as well, both with CUDA and CPU, always the same result.
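A minimal debugging sketch (not from the original report) for finding out why the container exits, assuming Docker keeps the stopped container around so its exit code and last log lines can still be inspected:

docker ps -a                                   # find the exited tabby container; note its exit code
docker logs --tail 100 <container-id-or-name>  # last lines the server printed before it stopped
docker inspect --format '{{.State.ExitCode}} OOMKilled={{.State.OOMKilled}}' <container-id-or-name>

The <container-id-or-name> placeholder is hypothetical; use whatever ID docker ps -a reports. OOMKilled=true would suggest the container ran out of memory (plausible when loading a 6.7 GiB Q8 model under Docker Desktop / WSL2 memory limits) rather than a crash in tabby itself.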

H:\> docker run -it -p 8080:8080 -v $HOME/.tabby:/data tabbyml/tabby serve --model TabbyML/StarCoder-1B   (in pwsh at 22:06:47)
2024-01-07T21:06:56.819842Z INFO tabby::serve: crates/tabby/src/serve.rs:111: Starting server, this might takes a few minutes...

The command runs, then it returns to the PowerShell prompt as if the server gets terminated.
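One variable worth ruling out (a sketch, not verified): how pwsh expands $HOME inside the -v argument. An explicit Windows path via $env:USERPROFILE removes any ambiguity about what actually gets mounted as /data:

docker run -it --gpus all -p 8080:8080 `
  -v "$env:USERPROFILE\.tabby:/data" `
  tabbyml/tabby serve --model TabbyML/StarCoder-1B --device cuda

If the server stays up with this form, the failure was in the volume mount rather than in tabby.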

wsxiaoys commented 8 months ago

Hi, could you try the Windows exe distribution at https://github.com/TabbyML/tabby/releases/tag/v0.7.0?

Mayorc1978 commented 8 months ago

Hi, could you try the Windows exe distribution at https://github.com/TabbyML/tabby/releases/tag/v0.7.0?

Giving it a try, thanks.

Mayorc1978 commented 8 months ago

Same problem (but it doesn't even download any model): I get a crash with the cuda117 build (werfault.exe shows up in Process Explorer, but no error code in the shell), and the cuda122 build insta-crashes (no time for werfault.exe to even appear in Process Explorer). I tried both versions. I don't understand the reason, since ollama + Llama Coder works very well for me: ollama installed with Docker (docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama), despite having no Windows exe distribution yet, gave me no problems together with the Llama Coder VSCode extension.
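A sketch for capturing whatever the exe prints before it dies, assuming a v0.7.0 cuda117 asset name (the filename below is an assumption; adjust it to the actual download):

# run from pwsh in the folder with the downloaded binary; *> redirects all PowerShell output streams to the file
.\tabby_x86_64-windows-msvc-cuda117.exe serve --model TabbyML/StarCoder-1B --device cuda *> .\tabby.log
Get-Content .\tabby.log -Tail 50    # inspect the last lines written before the crash

If the log stays empty, the Windows Application event log entry that WerFault writes (Event ID 1000) should at least name the faulting module.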