camilleborrett closed this issue 6 months ago
It suddenly works now, even though it didn't work at any point during the entire day. Closing this issue.
I updated the backend to 1.4.5. We had a bug with batching grammar requests in 1.4.3.
However there is still something weird at play here as the model seems to only partially follow the grammar. I pinged someone internally to look into it.
System Info
Model: mistralai/Mixtral-8x7B-Instruct-v0.1
Accessed through: https://api-inference.huggingface.co/models/mistralai/Mixtral-8x7B-Instruct-v0.1
Information
Tasks
Reproduction
Expected behavior
I copy/pasted the "Constrain with Pydantic" code from this link: https://huggingface.co/docs/text-generation-inference/conceptual/guidance#constrain-with-pydantic to test using guidance with the 'mistralai/Mixtral-8x7B-Instruct-v0.1' model.
I would like to test my use case using the HF serverless Inference API before launching my own endpoint. However, I get the following error message: {'error': 'Request failed during generation: Server error: ', 'error_type': 'generation'}. Is guidance not supported on the Inference API? (The request does work without guidance.)
My understanding is that the HF API is running on version 1.4.3, which should be compatible.
The only modifications I made to the code in the example link were adding
"Authorization": f"Bearer {huggingface_hub.get_token()}"
to the headers and using https://api-inference.huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1 as the URL.
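For reference, the request I am building looks roughly like the sketch below. It uses only the standard library; the JSON schema dict is a hand-written equivalent of what the Pydantic model in the linked docs example would generate, and the URL and token handling follow the modifications described above (both are my assumptions, not code from the docs).

```python
import json
import urllib.request

# Endpoint from this issue (assumed; the docs example targets a local TGI server).
API_URL = "https://api-inference.huggingface.co/models/mistralai/Mixtral-8x7B-Instruct-v0.1"

# Hand-written JSON schema equivalent to the Pydantic `Animals` model
# from the "Constrain with Pydantic" docs example.
schema = {
    "type": "object",
    "properties": {
        "location": {"type": "string"},
        "activity": {"type": "string"},
        "animals_seen": {"type": "integer", "minimum": 1, "maximum": 5},
        "animals": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["location", "activity", "animals_seen", "animals"],
}

payload = {
    "inputs": "convert to JSON: I saw a puppy a cat and a raccoon during my bike ride in the park",
    "parameters": {
        # TGI's grammar parameter: constrain generation to this JSON schema.
        "grammar": {"type": "json", "value": schema},
        "max_new_tokens": 250,
    },
}

def build_request(token: str) -> urllib.request.Request:
    # The Authorization header is the modification described in this issue.
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

Sending this request without the `grammar` key succeeds, so the failure appears specific to guidance.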