Closed aemonge closed 7 months ago
@aemonge llm-ls
does not run the model for you; it only sends requests to a backend API running a model.
I'd suggest looking at https://github.com/huggingface/text-generation-inference, either on your computer or on a remote host. You can also use https://github.com/mlc-ai/mlc-llm IIRC.
Finally, https://ollama.ai/ should be compatible with llm-ls pretty soon.
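For the text-generation-inference route, a rough sketch using its official Docker image might look like the following. This assumes Docker with NVIDIA GPU support and enough VRAM for the chosen model; the model id and the local cache path are illustrative, not prescribed by llm-ls:

```shell
# Sketch only: serve a Hugging Face model with text-generation-inference.
# Assumes Docker + NVIDIA container toolkit; adjust model id and volume.
model=WizardLM/WizardCoder-Python-34B-V1.0   # illustrative model id
volume=$PWD/data                             # local dir to cache weights

docker run --gpus all -p 8080:80 -v "$volume:/data" \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id "$model"
```

Once the container is up, llm-ls can be pointed at the local endpoint (e.g. http://localhost:8080) as its backend API.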
I've never heard about ollama, but it seems to be what I was looking for <3
Is there an issue tracking the ollama integration, so that I can be notified when it's ready?
Moreover, thank you very much for the support and guides
https://github.com/huggingface/llm-ls/pull/40
No problem!
Is there a tutorial on how to start the server with a model downloaded from Hugging Face?
So far, I have figured out how to build.
Though I have downloaded
WizardCoder-Python-34B-V1.0
I can't start the server. I've tried: