huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference

How to serve local models with python package (not docker) #2541

Open hahmad2008 opened 3 days ago

hahmad2008 commented 3 days ago

System Info

pip install text-generation with version '0.6.0'. I need to use the Python package, not Docker.

Information

Tasks

Reproduction

from text_generation import Client

# Initialize the client
client = Client("/path/to/model/locally")

# Generate text
response = client.generate("Your input text here")

error:

MissingSchema: Invalid URL '/path/to/model/locally': No scheme supplied. Perhaps you meant https:///path/to/model/locally?
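
Note: text_generation.Client is an HTTP client, so it expects the URL of a running TGI server rather than a local filesystem path. A minimal sketch of the intended usage, assuming a TGI server is already listening on a hypothetical local endpoint (http://127.0.0.1:8080):

from text_generation import Client

# Point the client at a running TGI server endpoint (placeholder URL),
# not at a model directory on disk.
client = Client("http://127.0.0.1:8080")

response = client.generate("Your input text here", max_new_tokens=64)
print(response.generated_text)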

I also tried the following, with models hosted on Hugging Face as well as local models, and it doesn't work either:

from text_generation import InferenceAPIClient
client = InferenceAPIClient("NousResearch/Meta-Llama-3.1-8B-Instruct")
text = client.generate("Why is the sky blue?").generated_text
print(text)
# ' Rayleigh scattering'

# Token Streaming
text = ""
for response in client.generate_stream("Why is the sky blue?"):
    if not response.token.special:
        text += response.token.text

print(text)

error:

NotSupportedError: Model `NousResearch/Meta-Llama-3.1-8B-Instruct` is not available for inference with this client. 
Use `huggingface_hub.inference_api.InferenceApi` instead.
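
As the error message suggests, hosted models should be queried through huggingface_hub rather than InferenceAPIClient. A minimal sketch using huggingface_hub.InferenceClient, assuming the model is actually available on the serverless Inference API and a valid HF token is configured:

from huggingface_hub import InferenceClient

# Hypothetical example; availability of this model on the serverless
# Inference API is not guaranteed.
client = InferenceClient("NousResearch/Meta-Llama-3.1-8B-Instruct")
text = client.text_generation("Why is the sky blue?", max_new_tokens=50)
print(text)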

Expected behavior

nbroad1881 commented 11 hours ago

There is no pip-installable package for the TGI server at the moment (the text-generation package is only an HTTP client). Use vLLM if that is what you need.
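
For completeness, a minimal sketch of serving a local checkpoint with vLLM's offline Python API instead, assuming vllm is installed and /path/to/model/locally is a placeholder for a Transformers-compatible model directory:

from vllm import LLM, SamplingParams

# Load the model directly from a local directory (placeholder path).
llm = LLM(model="/path/to/model/locally")
params = SamplingParams(max_tokens=64)

# Generate text for a single prompt.
outputs = llm.generate(["Why is the sky blue?"], params)
print(outputs[0].outputs[0].text)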