stepstone-tech / hnswlib-jna

Native-Like Performance for Nearest Neighbor Search in Java Applications using Hnswlib and Java Native Access
Apache License 2.0
32 stars 8 forks

how to assign parameters? #3

Closed ashfaq92 closed 4 years ago

ashfaq92 commented 4 years ago

As you know, HNSW uses the ef, efConstruction and M parameters for graph construction and nearest neighbor search. I cannot see in your code examples how to assign values to these parameters. Please explain.

hussamaa commented 4 years ago

Hi @ashfaq92,

When initialize(int maxNumberOfElements) is called, we use the default ef, efConstruction and M defined in Hnswlib, but there is also an overloaded version that lets you customize these parameters.

/**
 * Initialize the index to be used.
 *
 * @param maxNumberOfElements;
 * @param m;
 * @param efConstruction;
 * @param randomSeed .
 *
 * @throws IndexAlreadyInitializedException when an index reference was initialized before.
 * @throws UnexpectedNativeException when something unexpected happened in the native side.
 */
public void initialize(int maxNumberOfElements, int m, int efConstruction, int randomSeed) {
    ...
}

https://github.com/stepstone-tech/hnswlib-jna/blob/master/hnswlib-jna/src/main/java/com/stepstone/search/hnswlib/jna/Index.java

Thanks for your remark. I'll include that on our examples. Please let me know if you have any issues. Have a great weekend.

BR, Hussama
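
To make the relationship between the two overloads concrete, here is a minimal, self-contained sketch. It is not the hnswlib-jna Index class: InitializeOverloadSketch and its config() accessor are hypothetical names used only for illustration, and the default values (M = 16, efConstruction = 200, randomSeed = 100) are the defaults of the C++ hnswlib constructor, which the short overload is assumed to fall back to.

```java
import java.util.Arrays;

/** Hypothetical stand-in for an index class; NOT the real hnswlib-jna Index. */
class InitializeOverloadSketch {
    // Defaults of the C++ hnswlib constructor; assumed to be what the
    // one-argument overload falls back to.
    static final int DEFAULT_M = 16;
    static final int DEFAULT_EF_CONSTRUCTION = 200;
    static final int DEFAULT_RANDOM_SEED = 100;

    // {maxNumberOfElements, m, efConstruction, randomSeed}; null until initialized.
    private int[] config;

    /** Short form: only the capacity is given, everything else uses defaults. */
    public void initialize(int maxNumberOfElements) {
        initialize(maxNumberOfElements, DEFAULT_M, DEFAULT_EF_CONSTRUCTION, DEFAULT_RANDOM_SEED);
    }

    /** Long form: the caller chooses m, efConstruction and randomSeed. */
    public void initialize(int maxNumberOfElements, int m, int efConstruction, int randomSeed) {
        if (config != null) {
            throw new IllegalStateException("index already initialized");
        }
        config = new int[] {maxNumberOfElements, m, efConstruction, randomSeed};
    }

    /** Hypothetical accessor so the chosen values can be inspected. */
    public int[] config() {
        return config.clone();
    }

    public static void main(String[] args) {
        InitializeOverloadSketch defaults = new InitializeOverloadSketch();
        defaults.initialize(5000);
        System.out.println(Arrays.toString(defaults.config()));

        InitializeOverloadSketch custom = new InitializeOverloadSketch();
        custom.initialize(5000, 32, 400, 42);
        System.out.println(Arrays.toString(custom.config()));
    }
}
```

Note that neither overload takes ef: in this sketch, as in the signature quoted above, the fourth argument is the random seed.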

ashfaq92 commented 4 years ago

The example you mentioned only uses efConstruction and M. I still could not find where to specify the value of ef. Is randomSeed, the last parameter of the initialize method, actually ef?

hussamaa commented 4 years ago

Hi @ashfaq92,

The last parameter of the initialize function is indeed randomSeed. We haven't exposed setEf in our bindings yet. I'll expose it and release it in the upcoming version. Would that work for you?

BR, Hussama

ashfaq92 commented 4 years ago

Definitely. Thanks for the prompt replies and help. What default ef is it using right now? I really want to vary ef and record the method's performance, since ef has a significant impact on recall, accuracy and speed. I have successfully set up your code for my research project. Initial experiments showed that your implementation of HNSW is 74% faster than KDTrees in my research area. However, another pure Java implementation of HNSW was 90% faster, so your project may be a little slower because of the ef parameter. In the other HNSW implementation I set ef to 2, and I think your project uses a higher value of ef.
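
For anyone who wants to quantify the recall side of this trade-off, a minimal sketch of recall@k against brute-force ground truth follows. Everything in it is an assumption made for illustration: RecallSketch, exactKnn and recall are hypothetical names, the data is a toy 2-D set, and the "approximate" ids are hard-coded instead of coming from a real index.

```java
import java.util.Comparator;
import java.util.HashSet;
import java.util.Set;
import java.util.stream.IntStream;

/** Hypothetical helper for measuring recall@k; not part of any library. */
class RecallSketch {

    /** Squared Euclidean distance; the ordering is the same as for the true distance. */
    static double squaredEuclidean(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Exact k nearest neighbours by scanning every point (the ground truth). */
    static int[] exactKnn(float[][] data, float[] query, int k) {
        return IntStream.range(0, data.length)
                .boxed()
                .sorted(Comparator.comparingDouble(i -> squaredEuclidean(data[i], query)))
                .limit(k)
                .mapToInt(Integer::intValue)
                .toArray();
    }

    /** Fraction of the exact neighbours that also appear in the approximate result. */
    static double recall(int[] approx, int[] exact) {
        Set<Integer> truth = new HashSet<>();
        for (int id : exact) {
            truth.add(id);
        }
        int hits = 0;
        for (int id : approx) {
            if (truth.contains(id)) {
                hits++;
            }
        }
        return (double) hits / exact.length;
    }

    public static void main(String[] args) {
        float[][] data = {{0, 0}, {1, 0}, {0, 2}, {3, 3}, {5, 5}};
        float[] query = {0, 0};

        int[] exact = exactKnn(data, query, 3);
        // Stand-in for ids returned by an approximate index at some ef value.
        int[] approx = {0, 1, 3};

        System.out.println("recall@3 = " + recall(approx, exact));
    }
}
```

In the toy run, two of the three true neighbours are found. Repeating the measurement for several ef values, next to the query time, gives the recall/speed curve needed for a fair comparison between implementations.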

hussamaa commented 4 years ago

Hi @ashfaq92,

Thank you for the feedback and questions. I checked the native code and ef is initialized with the value 10.

another pure Java implementation of HNSW was 90% faster.

Wow! Which library did you try out? We had a look at https://github.com/jelmerk/hnswlib some time ago, but index building was quite slow for 5M+ items with 50+ dimensions.

BR, Hussama

hussamaa commented 4 years ago

About the new version: would it work for you to use a new version for your results? I'd include the setEf() function and the changes you previously suggested. I was supposed to work on it today, but I got busy with some other stuff (and the release on maven public) :D.

I could prepare everything for you by Thursday / Friday (max).

ashfaq92 commented 4 years ago

Yeah, sure. I also used the above-mentioned library. My initial experiments used 5000 two-dimensional vectors (Euclidean metric space), so once I move to higher dimensions and much larger sets of vectors, the performance difference may become obvious.

Yeah, without ef the experiments can't be run. Thanks again for the dedication and prompt support.

ashfaq92 commented 4 years ago

The new version 1.2.0 comes with a setEf function, so the parameter assignment problem is solved (example).

hussamaa commented 4 years ago

Thanks @ashfaq92. To be honest, we haven't had the chance to evaluate the impact of setting different values of ef. If you notice something weird, like the behaviour not changing or the same performance for different values, could you please let us know? Thanks a lot and I wish you a great day.
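
One way to check for the "behaviour not changing" case is a small sweep over ef values that compares the returned ids and records timings. The sketch below is only an illustration under stated assumptions: EfSearcher is a hypothetical adapter interface (a real index would have to be wrapped to fit it), and ToySearcher is a toy stand-in that scans only the first ef points. It is not HNSW and it does not call the hnswlib-jna bindings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.IntStream;

/** Hypothetical harness for checking whether changing ef changes query results. */
class EfSweepSketch {

    /** Hypothetical adapter shape; wrap a real index to match it. */
    interface EfSearcher {
        void setEf(int ef);
        int[] knnQuery(float[] query, int k);
    }

    /** Toy stand-in, NOT HNSW: it only looks at the first ef points of the data. */
    static class ToySearcher implements EfSearcher {
        private final float[][] data;
        private int ef = 10;

        ToySearcher(float[][] data) {
            this.data = data;
        }

        @Override
        public void setEf(int ef) {
            this.ef = ef;
        }

        @Override
        public int[] knnQuery(float[] query, int k) {
            int scanned = Math.min(ef, data.length);
            return IntStream.range(0, scanned)
                    .boxed()
                    .sorted(Comparator.comparingDouble(i -> dist(data[i], query)))
                    .limit(k)
                    .mapToInt(Integer::intValue)
                    .toArray();
        }
    }

    /** Squared Euclidean distance. */
    static double dist(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    /** Sets ef, runs every query, and returns the id lists in query order. */
    static List<int[]> runAt(EfSearcher searcher, int ef, float[][] queries, int k) {
        searcher.setEf(ef);
        List<int[]> results = new ArrayList<>();
        for (float[] query : queries) {
            results.add(searcher.knnQuery(query, k));
        }
        return results;
    }

    /** True if at least one query returned different ids in the two runs. */
    static boolean differs(List<int[]> a, List<int[]> b) {
        for (int i = 0; i < a.size(); i++) {
            if (!Arrays.equals(a.get(i), b.get(i))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        float[][] data = {{9, 9}, {8, 8}, {7, 7}, {1, 1}, {0, 0}};
        float[][] queries = {{0, 0}};
        EfSearcher searcher = new ToySearcher(data);

        List<int[]> previous = null;
        for (int ef : new int[] {2, 3, 5}) {
            long start = System.nanoTime();
            List<int[]> current = runAt(searcher, ef, queries, 2);
            long micros = (System.nanoTime() - start) / 1_000;
            boolean changed = previous != null && differs(previous, current);
            System.out.printf("ef=%d time=%dus changedFromPrevious=%b%n", ef, micros, changed);
            previous = current;
        }
    }
}
```

If results and timings stay identical across a wide range of ef values on a realistically sized data set, that would suggest the setting is not reaching the native side. Single wall-clock measurements like the ones above are noisy on the JVM, so a proper benchmark harness such as JMH would give steadier numbers.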