Closed ashfaq92 closed 4 years ago
Hi @ashfaq92,
When initialize(int maxNumberOfElements) is called, we use the default ef, efConstruction and M defined in Hnswlib, but there is also an overloaded version that lets you customize these parameters.
/**
* Initialize the index to be used.
*
* @param maxNumberOfElements;
* @param m;
* @param efConstruction;
* @param randomSeed .
*
* @throws IndexAlreadyInitializedException when an index reference was initialized before.
* @throws UnexpectedNativeException when something unexpected happened in the native side.
*/
public void initialize(int maxNumberOfElements, int m, int efConstruction, int randomSeed) {
...
}
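As a rough sketch of how the four arguments line up, the snippet below calls a stand-in interface that mirrors the parameter list quoted above, so it compiles without hnswlib-jna on the classpath. The names HnswInitializer and InitExample are hypothetical, and the numeric values are illustrative only, not confirmed library defaults.

```java
// Stand-in with the same parameter list as the initialize overload quoted above.
// It is NOT part of hnswlib-jna; it only lets this sketch compile on its own.
interface HnswInitializer {
    void initialize(int maxNumberOfElements, int m, int efConstruction, int randomSeed);
}

final class InitExample {
    // Illustrative values only (assumptions, not the library's defaults).
    static final int MAX_ELEMENTS = 5_000;   // capacity of the index
    static final int M = 16;                 // links per node in the graph
    static final int EF_CONSTRUCTION = 200;  // candidate list size while building
    static final int RANDOM_SEED = 100;      // seed for level assignment, not ef

    // Passes the four values in the order the overload expects.
    static void configure(HnswInitializer index) {
        index.initialize(MAX_ELEMENTS, M, EF_CONSTRUCTION, RANDOM_SEED);
    }
}
```

With the real library, the same four arguments would presumably be passed directly to the index's initialize method. Note that none of them is the query-time ef.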
Thanks for your remark. I'll include that in our examples. Please let me know if you have any issues. Have a great weekend.
BR, Hussama
The example that you mentioned only uses efConstruction and M. I still could not find where to specify the value of ef. Is randomSeed, the last parameter of the initialize method, actually ef?
Hi @ashfaq92,
The last parameter of the function initialize is indeed randomSeed. We haven't exposed setEf in our bindings. I'll expose it for you and release it in the upcoming version. Would that be fine for you?
BR, Hussama
Definitely. Thanks for the prompt replies and help. What is the default ef it is using right now? I really want to change the ef value and record the performance of the method. ef has a significant impact on recall, accuracy and performance.
I have successfully set up your code for my research project. Initial experiments showed that your implementation of HNSW is 74% faster than KD-trees in my research area. However, another pure Java implementation of HNSW was 90% faster. So your project may be somewhat slower because of the ef parameter. In the other HNSW implementation, I set ef to 2. I think your project uses a higher value of ef.
Hi @ashfaq92,
Thank you for the feedback and questions. I checked the native code and ef is initialized with the value 10.
"another pure Java implementation of HNSW was 90% faster."
Wow! Which library did you try out? We had a look at https://github.com/jelmerk/hnswlib some time ago, and the index building was quite slow for 5M+ items with 50+ dimensions.
BR, Hussama
About the new version, would it be fine for you to have a new version for your results? I'd include the setEf() function and the changes you previously suggested. I was supposed to work on it today but I got busy with some other stuff (and the release on maven public) :D.
I could prepare everything for you by Thursday / Friday (max).
Yeah, sure. I also used the above-mentioned library. My initial experiments were on 2 dimensions and 5000 vectors (Euclidean metric space), so once I move to higher dimensions and much larger sets of vectors, the performance difference may become more apparent.
Yeah, without ef, the experiments can't be run. Thanks again for your dedication and prompt support.
In reply to Hussama Ismail's message of Wednesday, July 1, 2020, on stepstone-tech/hnswlib-jna issue #3, "how to assign parameters?":
The new version 1.2.0 comes with the setEf function, so the problem of parameter assignment is solved (example).
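One way to record how search time changes with ef is a small sweep like the sketch below. TunableIndex is a hypothetical stand-in for the two calls the sweep needs (setting ef and running a k-NN query); its method shapes are assumptions and do not reproduce the hnswlib-jna signatures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an index that exposes ef and a k-NN query.
// The method shapes are assumptions, not the hnswlib-jna API.
interface TunableIndex {
    void setEf(int ef);
    int[] query(float[] vector, int k);
}

final class EfSweep {
    // Runs every query once per ef value and returns the total elapsed
    // nanoseconds for each ef, in the order the ef values were given.
    static Map<Integer, Long> time(TunableIndex index, int[] efValues, float[][] queries, int k) {
        Map<Integer, Long> elapsed = new LinkedHashMap<>();
        for (int ef : efValues) {
            index.setEf(ef);
            long start = System.nanoTime();
            for (float[] q : queries) {
                index.query(q, k);
            }
            elapsed.put(ef, System.nanoTime() - start);
        }
        return elapsed;
    }
}
```

A single pass like this is noisy on the JVM; warm-up runs or a benchmarking harness such as JMH would likely give steadier numbers.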
Thanks @ashfaq92. To be honest, we haven't had the chance to evaluate the impact of setting different values of ef. If you notice something odd, such as the behaviour not changing or the performance staying the same for different values, could you please let us know? Thanks a lot and I wish you a great day.
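To check whether different ef values actually change the results, one option is to compare the approximate neighbours against exact ones and compute recall. The helper below is a minimal, library-independent sketch; RecallCheck and its methods are hypothetical names, and it assumes Euclidean distance as in the experiments described above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Helper for comparing approximate k-NN results against exact ones.
// Nothing here comes from hnswlib-jna; all names are hypothetical.
final class RecallCheck {

    // Squared Euclidean distance between two vectors of equal length.
    static double squaredDistance(float[] a, float[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Exact k nearest neighbours by brute force.
    // Returns indices into data, nearest first.
    static int[] exactNeighbours(float[][] data, float[] query, int k) {
        Integer[] order = new Integer[data.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (x, y) -> Double.compare(
                squaredDistance(data[x], query), squaredDistance(data[y], query)));
        int n = Math.min(k, order.length);
        int[] result = new int[n];
        for (int i = 0; i < n; i++) result[i] = order[i];
        return result;
    }

    // Fraction of the exact neighbours that also appear in the approximate result.
    static double recall(int[] exact, int[] approximate) {
        if (exact.length == 0) return 1.0;
        Set<Integer> found = new HashSet<>();
        for (int id : approximate) found.add(id);
        int hits = 0;
        for (int id : exact) if (found.contains(id)) hits++;
        return (double) hits / exact.length;
    }
}
```

If recall stays identical across a wide range of ef values on a dataset large enough for the approximation to matter, that would suggest the setting is not taking effect.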
As you know, HNSW uses the ef, efConstruction and M parameters for graph construction and nearest neighbor search. I cannot see in your code examples how to assign values to these parameters. Please explain.