Open scchess opened 7 months ago
Same problem here! It seems the chat-ui client doesn't support "block responses," and Hugging Face Inference Endpoints don't support "streaming responses" by default (at least not in their `handler.py` template). There is some vague chatter about writing a custom endpoint handler that can stream responses, but I can't find how to do that. This is a big gap, in my opinion.
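For context, the `handler.py` template referenced above follows the `EndpointHandler` interface that Inference Endpoints expect for custom handlers. A minimal sketch of that interface (the echo logic is a placeholder standing in for real model inference) shows why it produces a single block response rather than a stream: `__call__` returns one complete payload.

```python
# Sketch of the EndpointHandler interface used by Hugging Face
# Inference Endpoints custom handlers. The echo body is a placeholder;
# a real handler would load a model in __init__ and run generation here.
from typing import Any, Dict, List


class EndpointHandler:
    def __init__(self, path: str = ""):
        # Normally: load the model/pipeline from `path` (the repo dir).
        self.path = path

    def __call__(self, data: Dict[str, Any]) -> List[Dict[str, Any]]:
        # The endpoint passes the request JSON; "inputs" holds the prompt.
        prompt = data.get("inputs", "")
        # The whole result is returned at once -- no streaming hook here.
        return [{"generated_text": prompt}]


handler = EndpointHandler()
print(handler({"inputs": "hello"}))  # → [{'generated_text': 'hello'}]
```

Since `__call__` must return the finished list, anything built on this template hands the client one block, which is presumably where the mismatch with a streaming-only client comes from.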
I want to deploy a few open source models with the chat UI. I started a simple model with:
And then simply added this new endpoint to my chat UI deployment. While my deployment can connect to the model server, I'm getting an error:
How should I disable streaming for the chat UI?