Closed ejdibs closed 3 years ago
Hello,
First, I am wondering why the inconsistency occurred. Do you have any idea about that? If not, could you answer the following questions?
- What is your OS?
- Did you install ngtpy from PyPI, or install it with setup.py?
- Did you build and install NGT (not ngtpy) with the shared-memory option?
- Did you reconstruct an ONNG from a default index with the optimizer?
- Did you insert and remove objects from multiple threads without locking?
- Could you run the command below to check your index and send the result?

ngt info -m a [index-folder]
We use NGT for our company's services to build indexes of more than 1 M objects and update them continuously. However, inconsistency has not occurred except at the very beginning of the service. At the moment, there are no functions to repair an inconsistent index. Since I understand the demand, I will consider implementing such functions.
Hello,
Thank you for your response. I believe that I was able to find the cause of the inconsistency.
I was creating a tar.gz of the index and backing it up after executing these methods:
index.build_index()
index.save()
Running this sequence of methods before creating an archive of the index reliably produced an inconsistent index.
I reread the ngtpy API and reviewed some of the source code, and noticed the close() method.
I updated my logic to close the index before attempting to archive the index directory.
index.build_index()
index.save()
index.close()
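For reference, the archive step itself can be sketched with only the Python standard library. The directory and file names below are hypothetical placeholders standing in for a real index directory that has already been saved and closed:

```python
# Sketch: archive an index directory only after save() and close() have
# both returned. Names here are placeholders, not real NGT files.
import os
import tarfile
import tempfile

def archive_index(index_dir: str, archive_path: str) -> None:
    """Create a tar.gz of an index directory. Call this only after
    index.save() and index.close() have completed."""
    with tarfile.open(archive_path, 'w:gz') as tar:
        tar.add(index_dir, arcname=os.path.basename(index_dir))

# Stand-in for a saved-and-closed index directory:
tmp = tempfile.mkdtemp()
index_dir = os.path.join(tmp, 'my_index')
os.makedirs(index_dir)
with open(os.path.join(index_dir, 'placeholder'), 'wb') as f:
    f.write(b'stand-in for index files')

archive = os.path.join(tmp, 'my_index.tar.gz')
archive_index(index_dir, archive)
```

The point is purely the ordering: the archive is taken only after the index handle is closed, so no buffered state is missing from the files on disk.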
I have found that closing the index before creating an archive prevents the inconsistency.
I am now able to load and modify archived indexes without errors.
Thank you for taking the time to communicate with me. If you have any other questions, please let me know.
Regards, Erik
Hello, If you use NGT built with the shared-memory option, calling close is required. However, if you use NGT without the shared-memory option, I think that omitting close just causes a memory leak. Which one do you use?
I build NGT like this:
cd NGT-1.9.1 \
&& mkdir build \
&& cd build \
&& cmake .. \
&& make \
&& make install \
&& ldconfig /usr/local/lib
I believe that this excludes the shared memory option.
For now, closing the index before archiving the data is fine for my use case. But if this information is useful and you have additional questions, please let me know.
Thank you for the information. It seems you do not use the shared-memory NGT.
Although I tried various things, I was not able to reproduce the bug. If you have time, could you reproduce an index (an archive) from which you cannot remove objects and send it?
BTW, after only removing objects, you do not need to call build_index; remove() updates the indexes as well.
Hello,
I would like to ask a question about removing nodes from an existing index.
When I remove nodes from my index I receive errors like this:
Graph::removeEdgeReliably: Lost conectivity! Isn't this ANNG? ID=2 anyway continue...
and also
/NGT-1.9.1/lib/NGT/Index.h:1622: remove:: cannot remove from tree. id=5
/NGT-1.9.1/lib/NGT/Tree.h:191: VpTree::remove: Inner error. Cannot remove object. leafNode=1693:
/NGT-1.9.1/lib/NGT/Node.cpp:260: VpTree::Leaf::remove: Cannot find the specified object. ID=5,0 idx=21 If the same objects were inserted into the index, ignore this message
After removing a subset of nodes from an existing index, will running
index.build_index()
restore connectivity? If I receive failures when attempting to remove a node, do I need to build a new index from a dataset that excludes those nodes?
An example of my current usage:
My current test index is 65 thousand nodes. My use case is to create an index of 3-5 million nodes, then periodically remove a small subset of nodes and add new ones. I am hoping to avoid re-inserting millions of nodes and rebuilding a fresh index every time.
Issues that cite support for adding and removing nodes:
Thank you for your time and consideration. Please let me know if there is any additional information that would help address my question.
Regards, Erik