TabbyML / tabby

Self-hosted AI coding assistant
https://tabby.tabbyml.com/

model_path is wrong in supervisor.rs on Windows #2394

Closed · neo-fetch closed this issue 1 week ago

neo-fetch commented 3 weeks ago

Describe the bug: When running the CPU build assets (./tabby) from the latest release, it starts llama-server, but the model path it constructs, used in supervisor.rs, produces an error saying the ggml file cannot be found.

This is because the path sent to supervisor.rs seems wrong.

The application panicked (crashed).
Message:  Failed to start llama-server with command Command { std: "\\Downloads\\tabby\\dist\\tabby_x86_64-windows-msvc\\llama-server.exe" "-m" "\\.tabby\\models\\TabbyML\\Nomic-Embed-Text\\ggml/model.gguf" "--cont-batching" "--port" "30889" "-np" "1" "--log-disable" "--ctx-size" "4096" "-ngl" "9999" "--embedding" "--ubatch-size" "4096", kill_on_drop: true }: The system cannot find the file specified. (os error 2)

Location: crates\llama-cpp-server\src\supervisor.rs:74

The mixed separators in \ggml/model.gguf are the issue. The server runs just fine once the path is corrected.
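For illustration, here is a minimal Rust sketch of how such a mixed-separator path can arise. The function names are hypothetical and this is not the actual supervisor.rs code, only a guess at the failure mode: a hard-coded "/" survives string formatting even when the base directory uses Windows backslashes, whereas building the path with std::path::PathBuf::join keeps the separators consistent.

use std::path::PathBuf;

// Hypothetical illustration -- not the actual supervisor.rs code.
// A hard-coded "/" in a format string keeps the forward slash even when
// the base directory uses Windows backslashes, producing a segment like
// "...\Nomic-Embed-Text\ggml/model.gguf" as seen in the panic message.
fn ggml_path_concat(model_dir: &str) -> String {
    format!("{}/ggml/model.gguf", model_dir)
}

// Joining the components with PathBuf uses the platform separator instead,
// so the joined segments come out consistent on Windows and Unix alike.
fn ggml_path_joined(model_dir: &str) -> PathBuf {
    PathBuf::from(model_dir).join("ggml").join("model.gguf")
}

fn main() {
    let dir = r"C:\Users\me\.tabby\models\TabbyML\Nomic-Embed-Text";
    println!("{}", ggml_path_concat(dir));            // mixed separators
    println!("{}", ggml_path_joined(dir).display());  // all backslashes on Windows
}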

Information about your version: tabby 0.12.0

Information about your GPU: CPU only

Additional context: The setup is running on Windows 10, inside Git Bash. I will try running this in PowerShell and see if anything changes.

wsxiaoys commented 3 weeks ago

Thanks for reporting the issue. As a short-term workaround, please start llama-server yourself with the corrected path and configure embedding per https://tabby.tabbyml.com/docs/administration/context/#using-a-remote-embedding-model-provider, using the llama.cpp/embedding backend.

neo-fetch commented 3 weeks ago

Thanks for reporting the issue. As a short-term workaround, please start llama-server yourself with the corrected path and configure embedding per https://tabby.tabbyml.com/docs/administration/context/#using-a-remote-embedding-model-provider, using the llama.cpp/embedding backend.

Thanks @wsxiaoys!

So something like this, I presume?

[model.embedding.http]
kind = "llama.cpp/embedding"
model_name = "Nomic-Embed-Text"

Then run ./llama-server with the appropriate path,

And then finally run tabby?

wsxiaoys commented 3 weeks ago

Run llama-server with the following flags (the path needs to be converted to Windows-style path segments):

llama-server -m ~/.tabby/models/TabbyML/Nomic-Embed-Text/ggml/model.gguf --cont-batching --port 30888 -np 1 --log-disable --ctx-size 4096 --embedding --ubatch-size 4096
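On Windows, the converted command would look something like the following (assuming the default model directory under the user profile; replace <user> with your Windows user name):

llama-server -m C:\Users\<user>\.tabby\models\TabbyML\Nomic-Embed-Text\ggml\model.gguf --cont-batching --port 30888 -np 1 --log-disable --ctx-size 4096 --embedding --ubatch-size 4096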

And configure the remote access as below in ~/.tabby/config.toml:

[model.embedding.http]
kind = "llama.cpp/embedding"
api_endpoint = "http://127.0.0.1:30888"
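
Note that the port in api_endpoint must match the --port value passed to llama-server (30888 in the command above).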

wsxiaoys commented 1 week ago

Released in 0.13