danny-avila / LibreChat

Enhanced ChatGPT Clone: Features Anthropic, AWS, OpenAI, Assistants API, Azure, Groq, o1, GPT-4o, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, langchain, DALL-E-3, ChatGPT Plugins, OpenAI Functions, Secure Multi-User System, Presets, completely open-source for self-hosting. Actively in public development.
https://librechat.ai/
MIT License
19.2k stars 3.2k forks

[Bug]: Hugging Face Inference Endpoint "Model not found" #2250

Closed: ArmykOliva closed this issue 7 months ago

ArmykOliva commented 7 months ago

What happened?

I am hosting my own LLM on a Hugging Face Inference Endpoint and using its OpenAI-compatible chat completions API. I have it set up under the custom endpoints key in librechat.yaml like this:

    - name: "Hugging Face notus-7b"
      apiKey: "${HF_API_KEY}"
      baseURL: "https://h6ddy7xwoxlldyt9.us-east-1.aws.endpoints.huggingface.cloud/v1"
      models:
        default: ["tgi"]

LibreChat then shows a "Model not found" error when I try to use the model. I also tried appending /completions to the baseURL, but that didn't fix the issue either.
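
For context, OpenAI-compatible clients typically append the route (here /chat/completions) to the configured base URL, so a baseURL that already ends in /completions would be expected to yield a path the server does not serve. Below is a minimal sketch of that joining, assuming LibreChat's custom endpoints follow the same convention. The helper name and the host are hypothetical and only illustrate the idea.

```python
def chat_completions_url(base_url: str) -> str:
    """Join an OpenAI-style base URL with the chat completions route."""
    # Strip a trailing slash so the join never produces "//".
    return base_url.rstrip("/") + "/chat/completions"


# Hypothetical host, used only for illustration.
good = chat_completions_url("https://example.invalid/v1")
bad = chat_completions_url("https://example.invalid/v1/completions")

print(good)  # base URL ends at /v1, route is appended once
print(bad)   # route is appended after the extra /completions segment
```

Under that assumption, the baseURL should stop at /v1, which matches the path used by the curl command in this report.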

Here is an example curl command that works against the inference endpoint without problems:

curl "https://xxxxxxx.us-east-1.aws.endpoints.huggingface.cloud/v1/chat/completions" \
-X POST \
-H "Authorization: Bearer hf_XXXXX" \
-H "Content-Type: application/json" \
-d '{
    "model": "tgi",
    "messages": [
        {
            "role": "user",
            "content": "What is deep learning?"
        }
    ],
    "stream": true,
    "max_tokens": 500
}'

Steps to Reproduce

  1. Create a Hugging Face Inference Endpoint at https://ui.endpoints.huggingface.co
  2. Add the model to the custom endpoints config in librechat.yaml
  3. Try using the model

What browsers are you seeing the problem on?

No response

Relevant log output

No response

Screenshots

No response


ArmykOliva commented 7 months ago

I just noticed that this issue is already being tracked on the project board: https://github.com/users/danny-avila/projects/2/views/1?pane=issue&itemId=30601966

danny-avila commented 7 months ago

I added you on Discord. With your setup, this config works fine for me (aside from the model returning gibberish). You may need to tune the temperature and other sampling parameters.

librechat.yaml file:

version: 1.0.5
cache: true
endpoints:
  custom:
    - name: "Hugging Face notus-7b"
      apiKey: "${HF_API_KEY}"
      baseURL: "https://h6ddy7xwoxlldyt9.us-east-1.aws.endpoints.huggingface.cloud/v1"
      models:
        default: ["tgi"]
      dropParams: ["top_p"]
