opendistro-for-elasticsearch / k-NN

🆕 A machine learning plugin which supports an approximate k-NN search algorithm for Open Distro.
https://opendistro.github.io/
Apache License 2.0
277 stars 55 forks source link

Failing KNNJNITests test #308

Closed jaume-ferrarons closed 3 years ago

jaume-ferrarons commented 3 years ago

I'm trying to build the k-NN from sources v1.12.0.0 https://github.com/opendistro-for-elasticsearch/k-NN/releases

How to reproduce it:

curl -L https://github.com/opendistro-for-elasticsearch/k-NN/archive/v1.12.0.0.tar.gz -o ./knn-v1.12.0.0.tar.gz
tar -xzf knn-v1.12.0.0.tar.gz

# Get nmslib
cd /home/elasticsearch/k-NN-1.12.0.0/jni/external
git clone https://github.com/nmslib/nmslib.git
cd /home/elasticsearch/k-NN-1.12.0.0/jni/external/nmslib
git checkout a2d6624e13

cd /home/elasticsearch/k-NN-1.12.0.0

# Start plugin build
./gradlew
./gradlew build

Error obtained:

2> REPRODUCE WITH: ./gradlew ':test' --tests "com.amazon.opendistroforelasticsearch.knn.index.KNNJNITests.testQueryHnswIndex" -Dtests.seed=97B922DFF419C552 -Dtests.security.manager=false -Dtests.locale=zh-Hant-TW -Dtests.timezone=America/Danmarkshavn -Druntime.java=14
  2> java.lang.AssertionError: expected:<126.0> but was:<11.224972>

Indeed, checking the code, the euclidean distance from [1,1,1,1] to [5,6,7,8] is sqrt((5-1)^2 + (6-1)^2 + (7-1)^2 + (8-1)^2) = 11.224. The test expects to be 126, that is the square of the euclidean distance.

Given that the comments say scores are evaluated using Euclidean distance I'd expect the right value to be 11.224. Indeed the tests were correct before this PR: https://github.com/opendistro-for-elasticsearch/k-NN/pull/291/files#diff-0f493d0f86a1b19c9bc913d73cdd2c39e1e61c0651622035ce4af91563869821L102-R111

Also this test seems to be failing for the same reason:

2> REPRODUCE WITH: ./gradlew ':test' --tests "com.amazon.opendistroforelasticsearch.knn.index.KNNJNITests.testQueryHnswIndexWithValidAlgoParams" -Dtests.seed=97B922DFF419C552 -Dtests.security.manager=false -Dtests.locale=zh-Hant-TW -Dtests.timezone=America/Danmarkshavn -Druntime.java=14
  2> java.lang.AssertionError: expected:<126.0> but was:<11.224972>

For similar reason:

  2> REPRODUCE WITH: ./gradlew ':test' --tests "com.amazon.opendistroforelasticsearch.knn.index.codec.KNN87Codec.KNN87CodecTests.testMultiFieldsKnnIndex" -Dtests.seed=97B922DFF419C552 -Dtests.security.manager=false -Dtests.locale=mt-MT -Dtests.timezone=Europe/Warsaw -Druntime.java=14
  2> java.lang.AssertionError: expected:<0.03846154> but was:<0.1667>

Thank you for your help.

jmazanec15 commented 3 years ago

Hi @jaume-ferrarons,

I believe the issue is that the 1.12.0.0 release expects nmslib 2.0.6, and you are using nmslib 2.0.11. Reverting to nmslib 2.0.6 will solve the issue.

If you want to use nmslib 2.0.11, you can checkout the opendistro-1.12 code (which is ahead of v1.12.0.0 tag) and build from source:

git clone https://github.com/opendistro-for-elasticsearch/k-NN.git
cd k-NN
git fetch
git checkout opendistro-1.12
./gradlew build
jaume-ferrarons commented 3 years ago

Hello @jmazanec15,

Thank you, that solved the issue and finished the build!

Still, it may be worth to review the tests to make sure the description match with the test.