Closed: martj001 closed this issue 5 months ago.
I'm also having this issue. Please fix!
Hello, I'm also having this issue. I need to run an inference server locally. I shouldn't be required to use a Hugging Face API token, since it's my local TGI.
Hi. Let me add that this issue also persists when using a token issued by Hugging Face's OAuth service: https://www.gradio.app/guides/sharing-your-app#o-auth-login-via-hugging-face
That's because these tokens can be used to access the Inference API, but not to log in; token validity, however, is checked by attempting to log in to the Hub, as detailed above.
As a workaround, I subclassed HuggingFaceEndpoint:

```python
from typing import Dict

from langchain_community.llms.huggingface_endpoint import HuggingFaceEndpoint
from langchain_core.pydantic_v1 import root_validator
from langchain_core.utils import get_from_dict_or_env


class LazyHuggingFaceEndpoint(HuggingFaceEndpoint):
    """LazyHuggingFaceEndpoint"""

    # We're using a lazy endpoint to avoid logging in with hf_token,
    # which might in fact be an hf_oauth token that only permits
    # inference, not logging in.
    @root_validator()
    def validate_environment(cls, values: Dict) -> Dict:
        """Validate that package is installed; SKIP API token validation."""
        try:
            from huggingface_hub import AsyncInferenceClient, InferenceClient
        except ImportError:
            msg = (
                "Could not import huggingface_hub python package. "
                "Please install it with `pip install huggingface_hub`."
            )
            raise ImportError(msg)  # noqa: B904
        huggingfacehub_api_token = get_from_dict_or_env(
            values, "huggingfacehub_api_token", "HUGGINGFACEHUB_API_TOKEN"
        )
        values["client"] = InferenceClient(
            model=values["model"],
            timeout=values["timeout"],
            token=huggingfacehub_api_token,
            **values["server_kwargs"],
        )
        values["async_client"] = AsyncInferenceClient(
            model=values["model"],
            timeout=values["timeout"],
            token=huggingfacehub_api_token,
            **values["server_kwargs"],
        )
        return values
```
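Usage is then the same as with `HuggingFaceEndpoint`. A minimal sketch, where the endpoint URL and token are placeholder values for my setup:

```python
# Sketch: the URL and token below are placeholders, not real values.
llm = LazyHuggingFaceEndpoint(
    endpoint_url="http://localhost:8080/",    # local TGI server
    huggingfacehub_api_token="hf_oauth_...",  # inference-only OAuth token
    max_new_tokens=256,
)
print(llm.invoke("Hello"))
```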
Might also help here: https://github.com/langchain-ai/langchain/issues/19685
Experiencing the same. I'd like to be able to talk to a local TGI that has no auth, but I can't.
This is a showstopper for me. It would be nice to see an acknowledgment of the issue here. I'm sorry I don't have a pull request.
I've tried commenting out the offending block per @martj001's original comment.
Here is my setup:

```shell
model=meta-llama/Meta-Llama-3-8B-Instruct

docker run --gpus all --shm-size 1g \
    -p 8080:80 -v $volume:/data \
    -e HUGGING_FACE_HUB_TOKEN=$token \
    ghcr.io/huggingface/text-generation-inference:2.0.2 \
    --model-id $model --quantize bitsandbytes-fp4 \
    --max-input-length 8000 --max-total-tokens 8192
```
I can verify it's accessible.
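For instance, a quick sanity check with huggingface_hub's InferenceClient pointed directly at the TGI URL (a sketch; the prompt and token count are arbitrary):

```python
from huggingface_hub import InferenceClient

# Point the client straight at the local TGI server; no token needed.
client = InferenceClient(model="http://127.0.0.1:8080")
print(client.text_generation("Hello", max_new_tokens=20))
```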
I have chat-ui running on the same VM with the following model config:
```env
# 'name', 'userMessageToken', 'assistantMessageToken' are required
MODELS=`[
  {
    "name": "meta-llama/Meta-Llama-3-8B-Instruct",
    "displayName": "meta-llama/Meta-Llama-3-8B-Instruct",
    "description": "meta-llama/Meta-Llama-3-8B-Instruct",
    "multimodal": false,
    "websiteUrl": "https://meta.ai",
    "userMessageToken": "",
    "userMessageEndToken": "<|eot_id|>",
    "chatPromptTemplate": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>{{@root.preprompt}}<|eot_id|>{{#each messages}}{{#ifUser}}<|start_header_id|>user<|end_header_id|>{{content}}<|eot_id|>{{/ifUser}}{{#ifAssistant}}<|start_header_id|>assistant<|end_header_id|>{{content}}<|eot_id|>{{/ifAssistant}}{{/each}}<|start_header_id|>assistant<|end_header_id|>",
    "parameters": {
      "temperature": 0.9,
      "top_p": 0.95,
      "top_k": 50,
      "truncate": 4096,
      "max_new_tokens": 4096,
      "stop": [
        "<|start_header_id|>",
        "<|end_header_id|>",
        "<|eot_id|>"
      ]
    },
    "endpoints": [
      {
        "type": "tgi",
        "url": "http://127.0.0.1:8080"
      }
    ]
  }
]`
```
My remote chat-ui works fine @ http://
On another remote machine I'm trying to run a langchain chain under chainlit. The only way I can get this to partially work is by commenting out the following in langchain_community/llms/huggingface_endpoint.py:
```python
@root_validator()
def validate_environment(cls, values: Dict) -> Dict:
    """Validate that package is installed and that the API token is valid."""
    # try:
    #     from huggingface_hub import login
    # except ImportError:
    #     raise ImportError(
    #         "Could not import huggingface_hub python package. "
    #         "Please install it with `pip install huggingface_hub`."
    #     )
    # try:
    #     huggingfacehub_api_token = get_from_dict_or_env(
    #         values, "huggingfacehub_api_token", "HUGGINGFACEHUB_API_TOKEN"
    #     )
    #     login(token=huggingfacehub_api_token)
    # except Exception as e:
    #     raise ValueError(
    #         "Could not authenticate with huggingface_hub. "
    #         "Please check your API token."
    #     ) from e
    ...
```
I construct it like this (I can't get HuggingFaceEndpoint to work):
```python
HuggingFaceTextGenInference(
    inference_server_url="http://<tgi.vm.ip>:8080",
    max_new_tokens=256,
    top_k=10,
    top_p=0.95,
    typical_p=0.95,
    temperature=0.8,
    repetition_penalty=1.03,
    streaming=True,
    timeout=30,
)
```
This sort of works, but I constantly get timeouts and non-responses. It's hard to tell what's going on without better logging.
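In the meantime, turning on LangChain's global debug and verbose flags gives at least some visibility (a sketch; `set_debug`/`set_verbose` live in `langchain.globals` in recent versions):

```python
from langchain.globals import set_debug, set_verbose

# Print chain/LLM inputs, outputs, and intermediate events to stdout.
set_debug(True)
set_verbose(True)
```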
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
Description
Background

While restructuring our codebase in response to the deprecation of `HuggingFaceTextGenInference`, I encountered an error when attempting to create a `HuggingFaceEndpoint` with a locally hosted TGI server.

Issue

The error occurs in the `validate_environment` function of the `huggingface_endpoint.py` file, specifically in lines 170-179. The `@root_validator()` decorator throws an error when `huggingfacehub_api_token` is passed as `None`, which happens due to `login(token=huggingfacehub_api_token)` in the `validate_environment` function. By commenting out the block that processes the API token and manually setting `huggingfacehub_api_token` to `None`, I am able to successfully create an `InferenceClient`.

Since `HuggingFaceTextGenInference` was fused into `HuggingFaceEndpoint` in PR #17254, we need to add logic to handle cases where `huggingfacehub_api_token` is passed as `None` or when no `HUGGINGFACEHUB_API_TOKEN` environment variable is set. This is particularly necessary for setups using a locally hosted TGI server, where authentication with the Hugging Face Hub may not be required.
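A minimal sketch of the kind of guard I have in mind for `validate_environment` (untested; `values` is the validator's input dict, and `get_from_dict_or_env` already accepts a `default`):

```python
# Sketch (untested): skip the Hub login when no token is configured.
huggingfacehub_api_token = get_from_dict_or_env(
    values, "huggingfacehub_api_token", "HUGGINGFACEHUB_API_TOKEN", default=None
)
if huggingfacehub_api_token is not None:
    try:
        login(token=huggingfacehub_api_token)
    except Exception as e:
        raise ValueError(
            "Could not authenticate with huggingface_hub. "
            "Please check your API token."
        ) from e
```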
System Info

huggingface-hub==0.22.2
langchain-community==0.0.32
platform: linux
python version: 3.10