Dharin-shah opened 7 months ago
Any support & direction here would be appreciated; I can pick up the change post-discussion :)
Hey @Dharin-shah, sorry for the delay. I think overall something can be done, but what exactly to do may be a little tricky and will require some brainstorming. I think separating the refresh interval for vector fields in the same index will be pretty tricky. I haven't looked into it, but decoupling it per field will not be straightforward.
I wonder, though, if something like keeping multiple graphs (based on size) in a single segment might be a way to solve the use case where writes are heavy. For instance, instead of building one graph on merge, we could keep X graphs from previous segments and search them all, while applying deletes. I don't believe this is supported in Lucene, but it is something we could potentially discuss.
The other potential idea is having a specialized read-only index - in other words, building one big graph.
@navneet1v might have some ideas as well.
Thanks @jmazanec15. Maybe something around the DocValuesConsumer?
I am not well versed in Lucene's APIs, but I can have a look for a simpler yet good solution.
@Dharin-shah right now, as Jack mentioned, there is no mechanism provided by the downstream system (Lucene) to create data structures like graphs, inverted indices, etc. for only some fields.
Now, one thing that we need to understand here is that Lucene segments are immutable, so even if we delay graph creation, if graphs are not created by the time segments are written to disk, this will lead to more catastrophic issues.
Some suggestions from my side:
thanks @navneet1v
I would probably move towards a smaller refresh interval if you are currently using a higher one, combined with periodic force merges. This will ensure that a lot of graph-bearing segments do not accumulate.
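As a concrete sketch of this suggestion: the two knobs are the index `refresh_interval` setting and the `_forcemerge` API. The index name `my-index` below is hypothetical; the helpers only build the request bodies/paths for the standard OpenSearch endpoints.

```python
# Sketch of the suggested setup: a small refresh interval plus periodic
# force merges. "my-index" is a hypothetical index name; the endpoints are
# the standard OpenSearch _settings and _forcemerge APIs.

def refresh_settings_body(interval: str) -> dict:
    # Body for: PUT /my-index/_settings
    return {"index": {"refresh_interval": interval}}

def force_merge_request(index: str, max_num_segments: int = 1) -> str:
    # Issue periodically (e.g. from a cron job) to collapse small segments:
    return f"POST /{index}/_forcemerge?max_num_segments={max_num_segments}"
```

For example, `refresh_settings_body("5s")` produces the `_settings` payload, and `force_merge_request("my-index")` the merge call to schedule off-peak.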
The current problem is that we do quite frequent merges and segment creation, due to a 30-second refresh interval. We have a write-heavy workload.
> One thing that we need to understand here is that Lucene segments are immutable, so even if we delay graph creation, if graphs are not created by the time segments are written to disk, this will lead to more catastrophic issues.
Of course, yeah. Since we use Faiss, is the HNSW graph not indexed separately? We would be fine if the graph is not reindexed or rebuilt for some time.
I guess the fundamental problem is that graph reindexing takes up quite a lot of CPU resources, so perhaps addressing that might be better than adding complex processes.
> Of course, yeah. Since we use Faiss, is the HNSW graph not indexed separately?
No, it's not.
> We would be fine if the graph is not reindexed or rebuilt for some time.
Yeah, this separation cannot be done because the graph is stored as part of a segment file. Separating graphs from segments would change the core fundamentals of how segments work.
@Dharin-shah I was thinking about the feature you have been asking for. One way I can see we might achieve something similar is by adding a setting that says: if the number of vectors in a segment is greater than a certain value, we create the graph; otherwise, we do not.
Let's say the limit is 1,000 vectors. If a segment has more than 1K vectors, we will create the HNSW graph; otherwise, we will not. During search, if an HNSW graph is present, we will do graph-based search; otherwise, we can do an exact search on the stored vectors. As segments merge during indexing, HNSW graphs will be created once the number of vectors crosses the 1K threshold.
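A small illustrative sketch of this policy (not the plugin's actual code; names and the threshold value are made up for the example): build the graph only above a per-segment vector count, and fall back to brute-force search on graph-less segments.

```python
# Illustrative sketch of the proposed policy: build an HNSW graph only when
# a segment holds more than `threshold` vectors, and fall back to exact
# (brute-force) search on segments without a graph.
from typing import List

THRESHOLD = 1000  # example per-segment vector-count threshold

def should_build_graph(num_vectors: int, threshold: int = THRESHOLD) -> bool:
    # Skip graph construction for small segments; merges will eventually
    # produce a segment large enough to cross the threshold.
    return num_vectors > threshold

def exact_search(vectors: List[List[float]], query: List[float], k: int) -> List[int]:
    # Brute-force k-NN over the stored vectors: score everything, take top k.
    def sq_dist(v: List[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(v, query))
    return sorted(range(len(vectors)), key=lambda i: sq_dist(vectors[i]))[:k]
```

For example, a 3-vector segment stays below the threshold (`should_build_graph(3)` is false), so a query would be answered via `exact_search` until merges grow the segment past 1K vectors.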
The only problem I can see with this is that there will be spikes in CPU utilization during background merges, but that would be the same without this feature too. Hence, I see this can help for write-heavy workloads.
Let me know your thoughts.
cc: @jmazanec15
I hinted at this idea in this GH issue: https://github.com/opensearch-project/k-NN/issues/1599, in the section "Create vector search data structures creation greedily". Although the idea in the issue is very generic, I do believe we can build the feature as I suggested in this comment.
I like the idea, @navneet1v. I have a question: if we are avoiding graph creation for smaller segments, do you think this will be effective? I feel creating graphs for smaller segments may not be computationally expensive, so the overall impact might be low.
> If we are avoiding graph creation for smaller segments, do you think this will be effective? I feel creating graphs for smaller segments may not be computationally expensive, so the overall impact might be low.
@vamshin
Yes, creating small graphs is computationally inexpensive, but this is a case of heavy write traffic. The graphs would be created and then thrown away as segments get merged. Since the traffic is mainly writes, I could set a higher minimum number of vectors required to create a graph and achieve good write throughput. We will not avoid all the spikes in CPU utilization, but at least this way we can increase write throughput, with occasional dips in throughput when the condition to create graphs is met.
Another extreme of this solution is to completely stop graph creation while you are indexing, then enable graph creation afterwards and hit the force merge API to recreate the segments and graphs. But that extreme is not going to work for the use case provided by @Dharin-shah. Hence, I proposed an intermediate solution that still aligns with creating graphs during force merges.
@Dharin-shah Are you trying to avoid 100% CPU usage, or to reduce how often the HNSW graph is constructed? Increasing the refresh time will only help with the second goal. If your goal is to avoid high CPU usage, we could focus on (1) incremental graph construction during ingestion (Lucene supports this) and (2) slowing down the graph construction process to prevent CPU exhaustion. For the latter, we could reuse the existing graph during merge (Lucene supports this).
Could you switch to the Lucene engine and see how it works for your case?
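For reference, a sketch of what such a mapping could look like (field name, dimension, and HNSW parameters below are arbitrary examples; the shape follows the k-NN plugin's `knn_vector` mapping with `method.engine` set to `lucene` rather than `faiss`):

```python
# Illustrative knn_vector mapping using the Lucene engine instead of Faiss.
# Field name, dimension, and HNSW parameters are made-up example values.
mapping = {
    "properties": {
        "my_vector": {
            "type": "knn_vector",
            "dimension": 128,
            "method": {
                "name": "hnsw",
                "engine": "lucene",       # switch from "faiss" to "lucene"
                "space_type": "l2",
                "parameters": {"m": 16, "ef_construction": 128},
            },
        }
    }
}
# Apply when creating the index: PUT /my-index with {"mappings": mapping}
```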
Gotcha, yep, that makes sense. Will try this out and report back. Thanks @heemin32
@Dharin-shah the feature for delaying graph creation was added in 2.18 and will be released with 2.18. Would you be interested in testing the feature and providing your feedback?
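If I understand the 2.18 feature correctly, it is controlled by an index-level threshold setting along the lines of the one sketched below; the exact setting name and semantics should be verified against the 2.18 release notes and documentation before use.

```python
# Hedged sketch: the delayed-graph-creation feature is driven by an index
# setting that defers HNSW construction for small segments. The setting
# name below is my best understanding of the 2.18 feature and should be
# verified against the official docs; the value 100 is an arbitrary example.
settings_body = {
    "index.knn.advanced.approximate_threshold": 100
}
# Apply with: PUT /my-index/_settings (index name hypothetical)
```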
Is your feature request related to a problem? Yes. One of our use cases involves a large JSON document in an index with various text and other fields used for filtering and aggregation; OpenSearch is our source of truth for the data we serve via an API. We also have a vector field in the same index, and we index the whole document every time some field changes, which causes a segment creation after a refresh and eventually a segment merge. I have seen quite a few hot threads taking 100% CPU during Lucene segment merges, and also during reindexing to rebuild the HNSW graph. My proposal is that we don't need to refresh the graph as often as we refresh the JSON containing the other fields.
Why we have it in the same index is another question, but primarily we use hybrid search with k-NN and BM25, and also do some pre- and post-filtering and aggregations on the other JSON fields. Hence it's faster, since we don't have to make multiple requests to achieve the same result.
What solution would you like? Can we have a separate refresh interval for k-NN? By default it would work as it does now, but it could be configured to refresh periodically based on that parameter. Reindexing the graph essentially means regenerating and rebuilding the vertices for that node. Since we have a write-heavy use case, we want to reduce the chances of that happening, and we are okay with a less accurate "semantic" search for a period.
What alternatives have you considered? We have built a delta layer during writes to skip writing when no field has changed, but since we have a lot of fields, even a small change in one of them causes a reindex, and essentially a reindex of the node in the graph, even if the vector hasn't changed.
Do you have any additional context? Added the logs