weaviate / weaviate-python-client

A python native client for easy interaction with a Weaviate instance.
https://weaviate.io/developers/weaviate/current/client-libraries/python.html
BSD 3-Clause "New" or "Revised" License

Batch import stuck even with just one record #1083

Open richard-yao opened 1 month ago

richard-yao commented 1 month ago

Hey all, I have run into a strange issue: I have a Flask app that handles requests and stores data in a Weaviate DB. When I debug with PyCharm, everything works well, but if I start the server with gunicorn, the same request with the same code gets stuck!

My server dependencies here:

flask[async]==3.0.2
qdrant-client==1.9.0
tiktoken==0.4.0
langchain==0.0.267
langchain-experimental==0.0.25
langsmith==0.0.25
openai==0.27.7
pandasai==1.5.20
pdfplumber==0.10.3
pypdf==3.9.0
streamlit>=1.24.0
streamlit-chat==0.0.2.2
anthropic==0.7.3
faiss-cpu==1.7.4
SQLAlchemy==2.0.19
mysql-connector-python==8.1.0
snowflake-id==0.0.4
prometheus-client==0.17.1
gunicorn==21.2.0
weaviate-client==4.5.5
sseclient-py==1.8.0
beautifulsoup4==4.12.2
apscheduler==3.10.4
flask-cors==4.0.0
flask-socketio==5.3.6
gevent==23.9.1
redis-py-cluster==2.1.3
dnspython==2.4.2
requests-toolbelt==1.0.0
pydantic==2.5.0
anyio==3.7.1
python-socketio==5.11.1
cachetools==5.3.2

And my python version is: Python 3.11.6

The issue code:

(screenshot of the issue code, not reproduced here)
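Based on the logs and the discussion below, the code in the screenshot is roughly of this shape (a sketch only — the collection name, property layout, and logging calls are assumptions, not the real schema):

```python
import logging

import weaviate

logger = logging.getLogger(__name__)

# Sketch only: the real app creates the client elsewhere; this assumes a local instance.
client = weaviate.connect_to_local()

def save_todo_item(item: dict) -> None:
    # "TodoItem" and the property layout are placeholders, not the real schema.
    collection = client.collections.get("TodoItem")
    logger.info("get collection name")

    logger.info("start add one data without vector")
    with collection.batch.dynamic() as batch:
        batch.add_object(properties=item)
        logger.info("complete batch add")
    # Per the maintainer's reply below, the implicit flush when this context
    # exits is where the request hangs under gunicorn -k gevent.
```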

When I debug with PyCharm, it works well; the logs are:

###|||2024-05-27 15:12:13|||INFO|||d9dd00b6f8a24d33b79b06865bcf3427|||MainThread|||weaviate_storage.py[478]--->get collection name
###|||2024-05-27 15:12:13|||INFO|||d9dd00b6f8a24d33b79b06865bcf3427|||MainThread|||weaviate_storage.py[486]--->start add one data without vector
###|||2024-05-27 15:12:13|||INFO|||d9dd00b6f8a24d33b79b06865bcf3427|||MainThread|||weaviate_storage.py[488]--->complete batch add
###|||2024-05-27 15:12:13|||INFO|||d9dd00b6f8a24d33b79b06865bcf3427|||MainThread|||third_service.py[270]--->Complete save_todo_items_to_memory
###|||2024-05-27 15:12:13|||INFO|||d9dd00b6f8a24d33b79b06865bcf3427|||MainThread|||third_service.py[278]--->step into update_todo_memory
###|||2024-05-27 15:12:13|||INFO|||d9dd00b6f8a24d33b79b06865bcf3427|||MainThread|||third_service.py[172]--->end process todo memory
###|||2024-05-27 15:12:13|||INFO|||d9dd00b6f8a24d33b79b06865bcf3427|||MainThread|||third_service.py[140]--->submit async_process_todo_memory task

And when I start the gunicorn server, the same request gets stuck with this log:

###|||2024-05-27 15:15:35|||INFO|||1b35c225b33b45179936e801c7e01adc|||Dummy-3|||weaviate_storage.py[478]--->get collection name
###|||2024-05-27 15:15:35|||INFO|||1b35c225b33b45179936e801c7e01adc|||Dummy-3|||weaviate_storage.py[486]--->start add one data without vector
###|||2024-05-27 15:15:35|||INFO|||1b35c225b33b45179936e801c7e01adc|||Dummy-3|||weaviate_storage.py[488]--->complete batch add
[2024-05-27 15:17:05 +0800] [54397] [CRITICAL] WORKER TIMEOUT (pid:54403)
[2024-05-27 15:17:06 +0800] [54397] [ERROR] Worker (pid:54403) was sent SIGKILL! Perhaps out of memory?
[2024-05-27 15:17:06 +0800] [57216] [INFO] Booting worker with pid: 57216

My gunicorn server command is:

gunicorn -k gevent -w 1 -t 90 -b 0.0.0.0:80 --threads 32 --worker-connections 1024 --log-level 'debug' 'server:app'

I also tried upgrading weaviate-client to 4.6.0, but that does not work either.

It looks like the with collection.batch.dynamic() as batch: block never completes, because my data is not stored in the Weaviate DB. I suspect the issue is caused by gevent, which is the only difference between the PyCharm debug run and the gunicorn server.

tsmith023 commented 1 month ago

Hi @richard-yao, I can replicate this bug and your analysis is correct! The addition of -k gevent causes the behaviour. The problem seems to be with the batch.flush() method that is called upon exiting the batch context, leading to some kind of deadlock between the asyncio event loop used by the client and gevent's event loop. I will look into it and see if there's a fix that can be made!
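For reference, a minimal sketch of the failing combination (a throwaway collection name and a local Weaviate instance are assumed; gunicorn's gevent worker applies monkey.patch_all() to each worker, which is what the first two lines imitate):

```python
# Imitates what gunicorn -k gevent applies to each worker before app code runs.
from gevent import monkey
monkey.patch_all()

import weaviate

client = weaviate.connect_to_local()          # assumes Weaviate on localhost
collection = client.collections.get("Repro")  # throwaway collection name

with collection.batch.dynamic() as batch:
    batch.add_object(properties={"content": "hello"})
# With weaviate-client 4.5.x/4.6.0 the implicit flush on exiting the context
# can hang here; remove monkey.patch_all() and the same code returns normally.
client.close()
```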

Edit: I've tested the bug on the implement-async-client branch, where the management of the client's internal event loop is much improved, and the bug doesn't occur there! Once that implementation lands in the next minor release this bug should be gone!

richard-yao commented 3 weeks ago

@tsmith023 Thanks for your answer! I switched to another API, collection.data.insert_many, and it works fine for me. I'm very glad to hear that you have fixed this problem, and I'm really looking forward to the next release.
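For anyone hitting the same thing, a minimal sketch of the insert_many call that works here (the collection name and properties are placeholders):

```python
import weaviate
from weaviate.classes.data import DataObject

client = weaviate.connect_to_local()             # the real app connects elsewhere
collection = client.collections.get("TodoItem")  # placeholder collection name

# insert_many issues a single request per call instead of going through the
# client-side batch context, so it sidesteps the flush that hangs under gevent.
result = collection.data.insert_many([
    DataObject(properties={"content": "buy milk"}),
    DataObject(properties={"content": "write report"}),
])
print(result.uuids)       # index -> generated UUID for each inserted object
print(result.has_errors)  # True if any object failed to insert
```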