yahoojapan / NGT

Nearest Neighbor Search with Neighborhood Graph and Tree for High-dimensional Data
Apache License 2.0

Shared Memory and reconstruct-graph #25

Closed: jpleet closed this issue 5 years ago

jpleet commented 5 years ago

When building with cmake -DNGT_SHARED_MEMORY_ALLOCATOR=ON .., the reconstruct-graph command doesn't get built.

Running: ngt reconstruct-graph
Returns: ngt::reconstructGraph: Not implemented
Aborted (core dumped)

Can the reconstruct-graph command be run on a shared-memory ANNG? Thanks! This is a cool system.

jpleet commented 5 years ago

Just saw line 674 in Command.cpp. Is there a plan to implement reconstruct-graph for shared memory? Is it even possible? Thanks

masajiro commented 5 years ago

There is no plan to implement it at the moment. To help me judge how necessary it is, could you tell me why you use NGT with shared memory? Your information will be helpful for future development.

jpleet commented 5 years ago

I'm using the NGT build with shared memory to create memory-mapped indices that are larger than RAM. Query time from an SSD is reasonable for my needs. I first created an ANNG on disk and then wanted to call reconstruct-graph to optimize it on disk, but that doesn't work. I noticed that there are now more than 3 graph_types (a,k,b) and that I can create an ONNG (o). I think it's working. If this makes sense, please close the issue. Thanks!

masajiro commented 5 years ago

Unfortunately, the graph that the construction mode (o) creates is less optimized than a graph produced by reconstruction. I will consider implementing reconstruction for shared memory. Another option is to build the ONNG with an NGT build that does not use shared memory, export it, and then import it with the shared-memory build of NGT. The export and import commands are not mentioned in the README.

ngt export index exported-files(directory)
ngt import index imported-files(directory)
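
The workaround above can be sketched as a small shell function. This is only an illustration under several assumptions, not a tested recipe: NGT_MEM and NGT_SHM are hypothetical variables pointing at an NGT binary built without and with -DNGT_SHARED_MEMORY_ALLOCATOR=ON, the index names under the work directory are made up, and the create and reconstruct-graph options (-d, -o, -i) and their values are placeholders that should be checked against the usage text printed by your ngt build. Only the export and import argument order is taken from the two commands shown above. By default the function prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch of the export/import workaround. All paths and option values are
# hypothetical; verify every flag against your own ngt binary.

NGT_MEM="${NGT_MEM:-ngt}"        # assumed: NGT built WITHOUT shared memory
NGT_SHM="${NGT_SHM:-ngt-shm}"    # assumed: NGT built WITH shared memory
OUTDEG="${OUTDEG:-10}"           # placeholder number of outgoing edges
INDEG="${INDEG:-120}"            # placeholder number of incoming edges

# Print the command when DRY_RUN is 1 (the default), execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# onng_via_export DATA DIM WORKDIR
# 1. build an ANNG with the non-shared-memory binary
# 2. reconstruct it into an ONNG
# 3. export the ONNG to a directory
# 4. import that directory with the shared-memory binary
onng_via_export() {
    data=$1
    dim=$2
    workdir=$3
    run "$NGT_MEM" create -d "$dim" "$workdir/anng" "$data" &&
    run "$NGT_MEM" reconstruct-graph -o "$OUTDEG" -i "$INDEG" \
        "$workdir/anng" "$workdir/onng" &&
    run "$NGT_MEM" export "$workdir/onng" "$workdir/exported" &&
    run "$NGT_SHM" import "$workdir/onng-shm" "$workdir/exported"
}
```

For example, `onng_via_export data.tsv 128 work` prints the four commands, and setting DRY_RUN=0 runs them. Note that steps 1 to 3 use the non-shared-memory build, so they presumably need a machine with enough RAM to hold the index.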

jpleet commented 5 years ago

Hey, any new thoughts about implementing an optimized ONNG construction for shared memory? If NGT did have the feature, you would have an ANN algorithm that:

I don't think any ANN algo could possibly do all this, besides NGT. You could have a very powerful tool.

masajiro commented 5 years ago

Thank you for your helpful comment.

Since SSDs are becoming cheaper and faster, I think that reading objects from the SSD does not increase the search time very much. Although I understand that this feature would be very competitive with other methods and especially useful for real applications, I have no time to implement it at this moment. However, I will implement it in the near future. In addition, even if you do not use graph reconstruction, I think that the performance is good enough for applications.

masajiro commented 5 years ago

I implemented graph reconstruction for shared memory. Since it has not been tested sufficiently, please let me know if you run into any problems.