Sorry, I'm not sure where to put this, but I thought I should mention that I wrote a C#/.NET port of this as part of my fork of Word2vec.Tools.
Pros / features:
appears to work correctly
no 2GB limit (tested with 5GB index file)
search_k support
To do / cons:
needs to be optimized (running gensim within a Docker image appears to be much faster)
doesn't yet use a Memory Mapped File
needs unit tests to verify correct results
needs performance tests (and a comparison with the C and Java versions)
can only read an index; cannot create one (same as the Java version)
some messy "scaffolding" comments left over from the porting process need to be cleaned up / deleted
I haven't had time to work on it for a while, so I thought I'd mention it in case anyone wants to pick up the project, is looking for a starting point for their own C# port, or wants a reference for backporting features to the Java version.
https://github.com/quole/Word2vec.Tools/blob/master/Word2vec.Tools/AnnoyIndex.cs
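
For anyone picking up the "Memory Mapped File" to-do item, here is a rough sketch of the general idea: map the file and read fixed-size records by byte offset instead of loading everything into memory. It is written in Python only to keep it short (in .NET the counterpart would be the System.IO.MemoryMappedFiles namespace). The record layout here is an assumption for illustration: a flat file of little-endian float32 vectors, which is not the actual Annoy index layout. The names DIM, write_vectors, read_vector and demo are made up for this example.

```python
import mmap
import os
import struct
import tempfile

DIM = 4                  # hypothetical vector length, not taken from any real index
RECORD_SIZE = DIM * 4    # one float32 (4 bytes) per component


def write_vectors(path, vectors):
    """Write vectors as consecutive little-endian float32 records."""
    with open(path, "wb") as f:
        for v in vectors:
            f.write(struct.pack(f"<{DIM}f", *v))


def read_vector(mm, index):
    """Read record `index` straight from the mapping by byte offset."""
    offset = index * RECORD_SIZE
    return struct.unpack_from(f"<{DIM}f", mm, offset)


def demo():
    """Write three small vectors to a temp file and read the last one back."""
    vectors = [[float(i + j) for j in range(DIM)] for i in range(3)]
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        write_vectors(path, vectors)
        with open(path, "rb") as f:
            # Length 0 maps the whole file; pages are loaded on demand by the OS.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                return read_vector(mm, 2)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Because the offset is computed as a plain integer and only the touched pages are read, the same pattern should carry over to files larger than 2GB on a 64-bit system, though I have not measured it against the current stream-based reader.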