[Bug]: Hugging Face Inference Endpoint "Model not found"

ArmykOliva commented 7 months ago

What happened?

I am hosting my own LLM through a Hugging Face Inference Endpoint and using its OpenAI-compatible chat completions API. I have it set up through the custom endpoint key in librechat.yaml like this:

    - name: "Hugging Face notus-7b"
      apiKey: "${HF_API_KEY}"
      baseURL: "https://h6ddy7xwoxlldyt9.us-east-1.aws.endpoints.huggingface.cloud/v1"
      models:
        default: ["tgi"]

LibreChat then shows the error "Model not found" when I try to use it. I also tried appending /completions to the baseURL, which didn't fix the issue either.

Here is an example curl command that uses the inference endpoint flawlessly:

curl "https://xxxxxxx.us-east-1.aws.endpoints.huggingface.cloud/v1/chat/completions" \
-X POST \
-H "Authorization: Bearer hf_XXXXX" \
-H "Content-Type: application/json" \
-d '{
    "model": "tgi",
    "messages": [
        {
            "role": "user",
            "content": "What is deep learning?"
        }
    ],
    "stream": true,
    "max_tokens": 500
}'
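For comparison, here is a minimal Python sketch of the same request, mainly to show how the baseURL from the config and the /chat/completions route combine into the URL that curl calls. The helper name build_chat_request is hypothetical, the host and token are the placeholders from the curl command, and stream is set to false here (unlike the curl example) so that a plain JSON response would come back if the request were actually sent. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a chat completions request.

    Hypothetical helper for illustration; not part of LibreChat.
    """
    # Join base URL and route without doubling or dropping the slash.
    url = base_url.rstrip("/") + "/chat/completions"
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # assumption: non-streaming, to keep the response simple
        "max_tokens": 500,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder host and token, copied from the curl example.
req = build_chat_request(
    "https://xxxxxxx.us-east-1.aws.endpoints.huggingface.cloud/v1",
    "hf_XXXXX",
    "tgi",
    "What is deep learning?",
)
print(req.full_url)
```

Passing the request to urllib.request.urlopen(req) would send it; with a baseURL ending in /v1, the resulting URL is the same one the curl command uses.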

Steps to Reproduce

1. Create a Hugging Face Inference Endpoint at https://ui.endpoints.huggingface.co
2. Add the model to the custom config
3. Try using the model

What browsers are you seeing the problem on?

No response

Relevant log output

No response

Screenshots

No response

Code of Conduct

[X] I agree to follow this project's Code of Conduct

ArmykOliva commented 7 months ago

I just noticed that this issue is already being tracked on the project board: https://github.com/users/danny-avila/projects/2/views/1?pane=issue&itemId=30601966

danny-avila commented 7 months ago

Added you on Discord. This config with your setup works fine for me (aside from the gibberish). You may need to dial in temperature, etc.

librechat.yaml file:

version: 1.0.5
cache: true
endpoints:
  custom:
    - name: "Hugging Face notus-7b"
      apiKey: "${HF_API_KEY}"
      baseURL: "https://h6ddy7xwoxlldyt9.us-east-1.aws.endpoints.huggingface.cloud/v1"
      models:
        default: ["tgi"]
      dropParams: ["top_p"]
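To illustrate what the dropParams line in this config is for: as I understand it, the listed parameter names are removed from the request body before it is sent to the endpoint. The Python sketch below shows that idea only. It is not LibreChat's actual implementation, the function name drop_params is hypothetical, and the temperature and top_p values in the sample payload are made up for the example.

```python
def drop_params(payload: dict, drop: list[str]) -> dict:
    """Return a copy of payload without the keys named in drop.

    Illustrative only; assumes dropParams simply omits these keys.
    """
    return {key: value for key, value in payload.items() if key not in drop}


# Hypothetical default parameters a client might send.
payload = {
    "model": "tgi",
    "temperature": 0.7,
    "top_p": 1,
    "max_tokens": 500,
}

cleaned = drop_params(payload, ["top_p"])
print(cleaned)
```

The original payload is left unchanged, and the cleaned copy no longer carries top_p, so the endpoint would fall back to its own default for that setting.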
