Closed kclarke1222 closed 5 years ago
NGT works for binary data with hamming distance as well, although I did not evaluate for hamming distance. When you use hamming distance for binary data, you have to specify 1 byte unsigned integer for the object type and hamming distance for the distance function with NGT create command. Also you should probably make the search range coefficient larger than its default for hamming distance.
how about for the jaccard distance? We are operating on binary data, and it is the more typical distance measure in chemistry applications.
Although I have not used the jaccard distance, I think that NGT can work for the jaccard distance as well. But, if the dimensionality of your data is large, NGT cannot effectively handle your data, because NGT cannot handle sparse data. Anyway since the distance is not implemented at this point, you have to implement it, if you want it in a hurry.
I added jaccard distance.
I was curios if these algorithms would work with binary data? I didn't see anything about that in the the papers on the algorithms.