Open liltimtim opened 1 week ago
Hi @liltimtim 👋

Unfortunately, I wasn't able to reproduce this on my machine. When I send 5000 requests every 1 ms or so I get the `"error":"Model is overloaded"` response, but not the panic.

Have you gotten this error on later versions as well, like `text-embeddings-inference:cpu-1.3`?
System Info
Sample Docker Compose File
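The reporter's actual compose file is not reproduced above. The following is a minimal hedged sketch of a compose file that runs the image referenced in this issue; the service name, port mapping, and model ID are illustrative assumptions, not the reporter's configuration:

```yaml
# Illustrative sketch only; not the reporter's actual compose file.
services:
  embeddings:
    image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.0
    ports:
      - "8080:80"  # TEI listens on port 80 inside the container by default
    # Model ID below is an assumption for illustration.
    command: --model-id BAAI/bge-small-en-v1.5
```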
When hitting the endpoint `/embed` over and over with the following data, the following issue sometimes occurs, but it will always result in the container halting.
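A hedged sketch of the kind of load loop that can trigger this: it builds TEI-style `{"inputs": ...}` payloads and fires many requests concurrently at `/embed`. The URL, sample text, request count, and worker count are assumptions for illustration; the actual data the reporter used is not shown above.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def build_payload(text: str) -> bytes:
    # TEI's /embed endpoint accepts a JSON body with an "inputs" field
    # (a string or a list of strings).
    return json.dumps({"inputs": text}).encode("utf-8")


def embed_once(url: str, text: str) -> int:
    # Send one POST to /embed and return the HTTP status code.
    req = urllib.request.Request(
        url,
        data=build_payload(text),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def hammer(url: str, text: str, n: int = 5000, workers: int = 50) -> list:
    # Fire n requests concurrently to reproduce the overload/halt.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda _: embed_once(url, text), range(n)))


# To reproduce against a running container (hypothetical port mapping):
#   hammer("http://localhost:8080/embed", "sample text", n=5000)
```

Whether you see `Model is overloaded` errors, a panic, or a full halt may depend on hardware and on how fast the requests arrive.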
Occasionally this error will happen (although it is harder to trigger)
Why does running multiple embed requests rapidly cause a consistent crash in Docker?
Information
Tasks
Reproduction
Step 1: Send a request to `/embed` with a valid JSON body and a sample text such as the following. Ensure Docker is running and the container is running image `ghcr.io/huggingface/text-embeddings-inference:cpu-1.0`.

Sample body:
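The reporter's actual sample body is not reproduced above; a hedged example of a valid body for TEI's `/embed` endpoint (the text itself is illustrative):

```json
{ "inputs": "The quick brown fox jumps over the lazy dog." }
```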
Step 2:
Expected behavior
The server should not crash or hang when receiving rapid requests.