Open MoritzLaurer opened 3 weeks ago
Hi @MoritzLaurer, thanks for reporting. It should work if you pass the URL as `base_url`. This is due to how URLs are treated: in this case, `/v1/chat/completions` must be appended to the model URL. I'll see what I can do to fix it.
In the meantime you can just do:
client = InferenceClient(base_url=API_URL)
or
client = InferenceClient(API_URL + "/v1/chat/completions")
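To illustrate the two call patterns above, here is a simplified sketch of the URL resolution as pure string logic. This is not the actual huggingface_hub implementation, just a model of the behavior described in this thread; the function name `resolve_chat_url` is made up for illustration.

```python
def resolve_chat_url(model=None, base_url=None):
    """Simplified sketch of how the chat-completions URL could be resolved.

    NOT the real huggingface_hub logic -- just an illustration: when
    base_url is given, the client appends /v1/chat/completions itself;
    when a full URL is passed as the model argument, it is used as-is,
    so the caller must include the suffix.
    """
    suffix = "/v1/chat/completions"
    if base_url is not None:
        return base_url.rstrip("/") + suffix
    if model is not None and model.startswith("http"):
        # A raw URL passed as `model` must already include the suffix.
        return model
    raise ValueError("need a base_url or a full model URL")


API_URL = "https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct"
print(resolve_chat_url(base_url=API_URL))
print(resolve_chat_url(model=API_URL + "/v1/chat/completions"))
```

Both variants end up targeting the same `/v1/chat/completions` route, which is why either workaround avoids the 422.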
I wanted to raise a related issue. I posted in langchain previously but got no response from the maintainers so far. I believe this is a huggingface-hub issue, so I'm posting it here again: https://github.com/langchain-ai/langchain/issues/24720
I got the same 422 error when using the ChatHuggingFace and HuggingFaceEndpoint APIs in langchain-huggingface. I get this error with huggingface-hub==0.24.6, but not when I downgrade to 0.24.0.
Also: https://github.com/langchain-ai/langchain/discussions/25675
There also seems to be a binding issue with parameters, but that is most likely a ChatHuggingFace-specific issue: https://github.com/langchain-ai/langchain/issues/23586#issuecomment-2304988796
And my nightmare with Hugging Face dedicated endpoints: https://github.com/langchain-ai/langchain/discussions/25675#discussioncomment-10423708 (It appears the JSON schema is sent as a tool option?) The endpoint generates the "tool `any`" error, not the client code.
I am getting the same error even when using client = InferenceClient(base_url=API_URL) or client = InferenceClient(API_URL + "/v1/chat/completions").
Error:
raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 422 Client Error: Unprocessable Entity for url: https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct/v1/chat/completions (Request ID: gjZMzbcCD6oWDQFV4HIXe)
Code:
from huggingface_hub import InferenceClient
API_URL = "https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct"
client = InferenceClient(API_URL + "/v1/chat/completions")
output = client.chat_completion([{"role": "user", "content": "ok"}], response_format={"type": "json_object"})
print(output.choices[0].message.content)
Hi @Saisri534, thanks for reporting this! Looks like you're running into a different bug than what this issue is about. Could you open up a new issue for that? 😄
Hi @hanouticelina, the issue I am facing may be due to "tiiuae/falcon-7b-instruct", but the code is working fine for "mistralai/Mixtral-8x7B-Instruct-v0.1".
Code:
from huggingface_hub import InferenceClient

client = InferenceClient("mistralai/Mixtral-8x7B-Instruct-v0.1")

messages = [
    {
        "role": "user",
        "content": "I saw a puppy a cat and a raccoon during my bike ride in the park. What did I saw and when?",
    },
]

response_format = {
    "type": "json",
    "value": {
        "properties": {
            "location": {"type": "string"},
            "activity": {"type": "string"},
            "animals_seen": {"type": "integer", "minimum": 1, "maximum": 5},
            "animals": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["location", "activity", "animals_seen", "animals"],
    },
}

response = client.chat_completion(
    messages=messages,
    response_format=response_format,
    max_tokens=500,
)
print(response.choices[0].message.content)
Output while using Mixtral:
{ "activity": "saw", "animals": ["puppy", "cat", "raccoon"], "animals_seen": 3, "location": "in the park during a bike ride" }
Error while using Falcon:
raise HfHubHTTPError(str(e), response=response) from e
huggingface_hub.utils._errors.HfHubHTTPError: 422 Client Error: Unprocessable Entity for url: https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct/v1/chat/completions (Request ID: hfRdP0ueGjO47DHM7ENhn)
@Saisri534 can you summarize this in a new separate issue please? Thanks! https://github.com/huggingface/huggingface_hub/issues/new
@Wauplin @hanouticelina Could you please help look into the issue that I described? I got this error when using huggingface-hub=0.24.6, but no such error when I downgrade to 0.24.0. Thanks!
Describe the bug
I could previously use the following code with the inference client and it worked (e.g. in this cookbook recipe for the HF endpoints).
This code now results in the error below. (Additional observation: if the endpoint is scaled to zero, the code first works, triggering the endpoint to start up, but once the endpoint is running, the error is thrown.)
I still get correct outputs via raw HTTP requests, so it doesn't seem to be an issue with the endpoint or my token.
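For reference, a minimal sketch of such a raw HTTP request against the endpoint. The URL and token are placeholders, and the payload shape is the plain text-generation format (not `/v1/chat/completions`); the actual network call is commented out since it needs a live endpoint and a valid token.

```python
import json

# Placeholder values -- substitute your own endpoint URL and hf_... token.
API_URL = "https://api-inference.huggingface.co/models/tiiuae/falcon-7b-instruct"
HEADERS = {"Authorization": "Bearer hf_...", "Content-Type": "application/json"}

# Minimal text-generation payload sent to the raw model route.
payload = json.dumps({"inputs": "ok", "parameters": {"max_new_tokens": 32}})

# Uncomment to actually send the request (requires the `requests` package):
# import requests
# resp = requests.post(API_URL, headers=HEADERS, data=payload)
# print(resp.status_code, resp.text)
print(payload)
```

If this direct request returns 200 while InferenceClient returns 422 against the same endpoint, that points at URL or payload construction in the client rather than at the endpoint itself.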
Reproduction
No response
Logs
No response
System info