Closed: Satyadev592 closed this issue 1 year ago
@Satyadev592 This sounds like a known issue with compaction and handoff in v2.2.3. Any chance you could run a test on v2.2.8? If it reproduces, could you please attach the logs from all the Milvus pods for investigation? /assign @Satyadev592
We have a similar test scenario in house with the latest build: every day we insert and delete roughly the same number of rows, so the total row count does not change much. We can see that the query node memory stays stable.
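For reference, the test loop is roughly like the sketch below (pymilvus; the collection name, schema, and batch size are illustrative placeholders, not the exact in-house test):

```python
# Sketch of a daily insert/delete stability test: insert and delete roughly the
# same number of rows each "day" so the total row count stays about the same.
# Collection name, schema, and batch size are illustrative placeholders.
import random
from pymilvus import (
    connections, utility, Collection, CollectionSchema, FieldSchema, DataType,
)

connections.connect(host="localhost", port="19530")

DIM = 128
BATCH = 10_000  # per-day insert/delete volume used in this sketch

if not utility.has_collection("stability_test"):
    schema = CollectionSchema([
        FieldSchema("pk", DataType.INT64, is_primary=True),
        FieldSchema("vec", DataType.FLOAT_VECTOR, dim=DIM),
    ])
    Collection("stability_test", schema)

coll = Collection("stability_test")

def daily_cycle(day: int) -> None:
    # Insert BATCH rows with primary keys unique to this day ...
    pks = [day * BATCH + i for i in range(BATCH)]
    vecs = [[random.random() for _ in range(DIM)] for _ in pks]
    coll.insert([pks, vecs])
    # ... then delete yesterday's rows, keeping the total roughly constant.
    if day > 0:
        old = [(day - 1) * BATCH + i for i in range(BATCH)]
        coll.delete(expr=f"pk in {old}")
    coll.flush()
```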
Could you try to trigger a compaction and see what happens?
Using 2.2.8 might be a better choice
@Satyadev592 we have released v2.2.9; any chance you could run a test on the latest release?
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Rotten issues close after 30d of inactivity. Reopen the issue with /reopen
We upgraded to 2.2.9 and thought this issue was behind us, but unfortunately that was short-lived. The issue where memory utilisation spikes up randomly has presented itself yet again. Attaching some graphs for reference.
The memory spikes seem to be correlated with the DDL requests ShowPartitions and DescribeCollection.
@Satyadev592 Could you please attach an etcd backup for investigation? See https://github.com/milvus-io/birdwatcher for details on how to back up etcd with birdwatcher.
Untitled.txt Here's the backup as requested.
@Satyadev592 could you try a manual compaction and check whether memory decreases?
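Something like this minimal sketch would do (the collection name is a placeholder; memory itself is easiest to read off the Milvus metrics / Grafana dashboard before and after):

```python
# Sketch: trigger a manual compaction via pymilvus, wait for it to finish, then
# list the segments the query nodes currently serve as a rough sanity check.
# "my_collection" is a placeholder; compare query node memory on the metrics
# dashboard before and after the compaction completes.
from pymilvus import connections, Collection, utility

connections.connect(host="localhost", port="19530")
coll = Collection("my_collection")

coll.compact()                        # ask Milvus to merge small / heavily-deleted segments
coll.wait_for_compaction_completed()  # block until the compaction plans are done

for seg in utility.get_query_segment_info("my_collection"):
    print(seg.segmentID, seg.num_rows, seg.mem_size, seg.state)
```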
Is there an existing issue for this?
Environment
Current Behavior
We are noticing an increase in query node RAM utilisation over time. Our day-level operations can be classified as follows:
- 300K inserts
- 300K deletes
- 1M updates (delete followed by insert)

Technically our num_entities is not changing over time, but our query node utilisation is increasing over time. Here's a graph illustrating our issue:
We have tried manual compaction every day to see if that solves the issue, but it does not. We are currently doing rolling restarts of the query nodes, which brings them back to about 35 GB of usage.
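For context, our update path is essentially a delete by primary key followed by a re-insert, roughly like the sketch below (collection and field names are placeholders, not our production schema):

```python
# Illustrative sketch of the daily "update" pattern: each update deletes the old
# entity by primary key and re-inserts the new version, so num_entities stays
# flat while delete records accumulate until compaction cleans them up.
# Collection and field names are placeholders.
from pymilvus import connections, Collection

connections.connect(host="localhost", port="19530")
coll = Collection("prod_collection")

def update_entities(pks, vectors):
    coll.delete(expr=f"pk in {list(pks)}")  # drop the old versions
    coll.insert([list(pks), vectors])       # re-insert under the same primary keys

# The daily manual compaction we already tried (memory did not come down):
coll.compact()
coll.wait_for_compaction_completed()
```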
Expected Behavior
The memory utilisation for the query nodes should plateau for our use case as the number of entities/vectors is not increasing.
Steps To Reproduce
No response
Milvus Log
No response
Anything else?
No response