huggingface / huggingface_hub

The official Python client for the Huggingface Hub.
https://huggingface.co/docs/huggingface_hub
Apache License 2.0

Gateway Timeout for mistralai/Mixtral-8x7B-Instruct-v0.1 #2459

Closed · noobjam closed this issue 3 months ago

noobjam commented 3 months ago

Describe the bug

I encountered a 504 Gateway Timeout error when attempting to use the mistralai/Mixtral-8x7B-Instruct-v0.1 model via the Hugging Face Inference API. The error occurred during model invocation and prevented successful completion of the request.

Reproduction

```python
from langchain.llms import HuggingFaceEndpoint
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Initialize the HuggingFaceEndpoint
llm = HuggingFaceEndpoint(
    repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
    max_new_tokens=300,
    top_k=50,
    top_p=0.99,
    temperature=0.01,
    huggingfacehub_api_token="xxxxxxxxxxxxxxxxxxxxxxx",  # API token
)

prompt = PromptTemplate(
    template="Translate the following text to French: {text}",
    input_variables=["text"],
)

chain = LLMChain(llm=llm, prompt=prompt)

result = chain.run({"text": "Hello, how are you?"})
print(result)
```
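
In case it helps isolate the problem, here is a stripped-down sketch that calls the serverless Inference API directly through huggingface_hub's `InferenceClient`, with the same model, parameters, and placeholder token as above. This is my assumption that the LangChain wrapper ultimately hits the same endpoint, so this call should run into the same timeout:

```python
from huggingface_hub import InferenceClient

# Minimal sketch without LangChain: query the serverless Inference API directly.
# Same model and placeholder token as in the reproduction above.
client = InferenceClient(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    token="xxxxxxxxxxxxxxxxxxxxxxx",
)

output = client.text_generation(
    "Translate the following text to French: Hello, how are you?",
    max_new_tokens=300,
    top_k=50,
    top_p=0.99,
    temperature=0.01,
)
print(output)
```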

Logs

No response

System info

- huggingface_hub version: 0.24.5
- Platform: Windows-10-10.0.22631-SP0
- Python version: 3.11.1
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: C:\Users\elQajjam Mohammed\.cache\huggingface\token
- Has saved token ?: True
- Who am I ?: JamLoad
- Configured git credential helpers: manager
- FastAI: N/A
- Tensorflow: N/A
- Torch: N/A
- Jinja2: N/A
- Graphviz: N/A
- keras: N/A
- Pydot: N/A
- Pillow: N/A
- hf_transfer: N/A
- gradio: N/A
- tensorboard: N/A
- numpy: 1.26.4
- pydantic: 2.8.2
- aiohttp: 3.10.1
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: C:\Users\elQajjam Mohammed\.cache\huggingface\hub
- HF_ASSETS_CACHE: C:\Users\elQajjam Mohammed\.cache\huggingface\assets
- HF_TOKEN_PATH: C:\Users\elQajjam Mohammed\.cache\huggingface\token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: False
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10
Wauplin commented 3 months ago

Hi @noobjam, sorry for the inconvenience. The model mistralai/Mixtral-8x7B-Instruct-v0.1 is not available on our Inference API. I agree the error message should be more explicit than the HTTP 504 you are currently getting; we are working on fixing this. In the meantime, I advise you to use either https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct or https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct for your use case. Otherwise, you can deploy mistralai/Mixtral-8x7B-Instruct-v0.1 as a dedicated Inference Endpoint, but that would require you to pay for the server.
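
For reference, a minimal sketch of the suggested workaround, reusing the same LangChain setup as in the reproduction and only swapping the repo_id. One assumption on my side: the Meta-Llama models are gated, so the token must belong to an account that has accepted their license on the Hub.

```python
from langchain.llms import HuggingFaceEndpoint
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Same configuration as the original reproduction; only repo_id is swapped for
# a model that is served by the serverless Inference API.
# Assumption: the token has access to the gated meta-llama repository.
llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
    max_new_tokens=300,
    top_k=50,
    top_p=0.99,
    temperature=0.01,
    huggingfacehub_api_token="xxxxxxxxxxxxxxxxxxxxxxx",  # API token
)

prompt = PromptTemplate(
    template="Translate the following text to French: {text}",
    input_variables=["text"],
)
chain = LLMChain(llm=llm, prompt=prompt)

print(chain.run({"text": "Hello, how are you?"}))
```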