michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving text-embeddings, reranking models and clip
https://michaelfeil.github.io/infinity/
MIT License
1.23k stars, 83 forks

get asyncio.exceptions.CancelledError error #317

Open DFanny-5 opened 1 month ago

DFanny-5 commented 1 month ago

System Info

Running on CPU in Kubernetes. CMD:

infinity_emb v1 --model-name-or-path /mnt/model-storage/ms-marco-MiniLM-L-12-v2 --port 8452 --batch-size 32

Reproduction

Running the command on my local computer works, but running it on Kubernetes shows the following error:

INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8452 (Press CTRL+C to quit)
ERROR: Traceback (most recent call last):
  File "/usr/local/lib/python3.11/contextlib.py", line 222, in __aexit__
    await self.gen.athrow(typ, value, traceback)
  File "/usr/local/lib/python3.11/site-packages/infinity_emb/infinity_server.py", line 82, in lifespan
    yield
  File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 741, in lifespan
    await receive()
  File "/usr/local/lib/python3.11/site-packages/uvicorn/lifespan/on.py", line 137, in receive
    return await self.receive_queue.get()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/asyncio/queues.py", line 158, in get
    await getter
asyncio.exceptions.CancelledError

Expected behavior

I see "INFO: Uvicorn running on http://0.0.0.0:8452 (Press CTRL+C to quit)", so the server seems to start correctly? But the asyncio.exceptions.CancelledError causes the pod to crash immediately.

michaelfeil commented 1 month ago

I think the server did not start correctly. It errored during startup. Can you try running ms-marco directly from Hugging Face on your k8s cluster?

DFanny-5 commented 1 month ago

> I think the server did not start correctly. It error-ed during startup. Can you try running ms-marco from huggingface on your k8s cluster?

Thanks for the quick reply. Here is what I tried. I fetched the model with: git clone https://huggingface.co/cross-encoder/ms-marco-MiniLM-L-12-v2

Then I ran infinity_emb v1 --model-name-or-path /mnt/model-storage/ms-marco-MiniLM-L-12-v2 --port 8452 locally, and it seemed to succeed. But after I zipped the ms-marco-MiniLM-L-12-v2 folder, uploaded it to the persistent volume, and unzipped it on k8s, running the same command produced the same asyncio.exceptions.CancelledError traceback as in my original report.

DFanny-5 commented 1 month ago

I don't think the upload itself is the problem, because the same process works for the ms-marco-TinyBERT-L-2 model.
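One way to rule the zip/upload/unzip step in or out is to compare file checksums between the local model folder and the copy on the persistent volume. The following is a minimal sketch, not something from the infinity project: `hash_tree` and `diff_manifests` are hypothetical helper names, and it assumes Python is available on both machines and that the manifest from one side can be copied to the other (for example as JSON).

```python
import hashlib
from pathlib import Path


def hash_tree(root: str) -> dict[str, str]:
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    base = Path(root)
    digests: dict[str, str] = {}
    for path in sorted(base.rglob("*")):
        if not path.is_file():
            continue
        sha = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large weight files are not loaded at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                sha.update(chunk)
        digests[path.relative_to(base).as_posix()] = sha.hexdigest()
    return digests


def diff_manifests(local: dict[str, str], remote: dict[str, str]) -> list[str]:
    """Return the sorted relative paths that are missing on one side or differ."""
    return sorted(
        name
        for name in set(local) | set(remote)
        if local.get(name) != remote.get(name)
    )
```

Running `hash_tree` on the local clone and on /mnt/model-storage/ms-marco-MiniLM-L-12-v2, then passing both results to `diff_manifests`, should return an empty list if the copy is intact. Note that a folder created with git clone also contains a .git directory, which will be included in the manifest unless it is excluded before zipping.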

michaelfeil commented 1 month ago

The message indicates an ungraceful exit, potentially because a new replica is being rolled out, the pod does not have enough resources, or the model was saved incorrectly.
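To illustrate why an ungraceful exit surfaces as asyncio.exceptions.CancelledError at `await getter`, here is a minimal sketch using only the standard library. It assumes that the traceback comes from the task waiting on the lifespan receive queue being cancelled from outside (for instance when the process is told to stop); that cause is an assumption and is not confirmed in this thread. The names `wait_for_message` and `demo` are hypothetical and are not part of uvicorn, starlette, or infinity.

```python
import asyncio


async def wait_for_message(queue: asyncio.Queue) -> str:
    # Blocks until a message arrives, like a lifespan handler waiting for
    # its next event. If the task is cancelled while waiting, the
    # CancelledError is raised right here, inside queue.get().
    return await queue.get()


async def demo() -> str:
    queue: asyncio.Queue = asyncio.Queue()
    waiter = asyncio.create_task(wait_for_message(queue))
    await asyncio.sleep(0)  # let the waiter start and block on queue.get()
    waiter.cancel()         # simulate an external shutdown cancelling the task
    try:
        await waiter
    except asyncio.CancelledError:
        return "cancelled"
    return "completed"


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch nothing is wrong with the waiting code itself; the error only reports that something else ended the wait. If the same holds for the pod, the more useful evidence is likely to be in the pod's events and exit reason (restarts, resource limits, failed probes) than in the Python traceback.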