SthPhoenix / elastik-nearest-neighbors-extended

Elasticsearch plugin for approximate K-nearest-neighbors on floating-point vectors. Extended version.

ScriptDocValues$Doubles cannot be cast to class VectorScriptDocValues$DenseVectorScriptDocValues #4

Open etudor opened 5 years ago

etudor commented 5 years ago

I'm using ES 7.3.1 and have updated the plugin to compile against 7.3.1. I created an index with this config:

aknn_tables = 64
aknn_bits = 18
aknn_dimensions = 1280

I followed the steps in your demo to create the index and index ~70K docs. The mapping:

{
    "properties": {
        "_aknn_vector": {
            "type": "dense_vector",  # I tried both "half_float" and "dense_vector" here
            "dim": 1280,
            "index": False
        },
        "id": {
            "type": "keyword"
        }
    }
}

But when searching, I get the following error:

"stacktrace": ["org.elasticsearch.transport.RemoteTransportException: [2a0c22bdf494][172.18.0.7:9300][indices:data/read/search[phase/query]]",
"Caused by: org.elasticsearch.script.ScriptException: runtime error",
"at org.elasticsearch.painless.PainlessScript.convertToScriptException(PainlessScript.java:94) ~[?:?]",
"at org.elasticsearch.painless.PainlessScript$Script.execute((1.0 + dotProduct(params.queryVector, doc[params.vector])) / 2.0:42) ~[?:?]",
"at org.elasticsearch.common.lucene.search.function.ScriptScoreFunction$1.score(ScriptScoreFunction.java:86) ~[elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.common.lucene.search.function.ScriptScoreQuery$1$1.score(ScriptScoreQuery.java:89) ~[elasticsearch-7.3.1.jar:7.3.1]",
"at org.apache.lucene.search.QueryRescorer.rescore(QueryRescorer.java:98) ~[lucene-core-8.1.0.jar:8.1.0 dbe5ed0b2f17677ca6c904ebae919363f2d36a0a - ishan - 2019-05-09 19:34:03]",
"at org.elasticsearch.search.rescore.QueryRescorer.rescore(QueryRescorer.java:74) ~[elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.search.rescore.RescorePhase.execute(RescorePhase.java:47) ~[elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.search.query.QueryPhase.execute(QueryPhase.java:116) ~[elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.search.SearchService.loadOrExecuteQueryPhase(SearchService.java:335) ~[elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:360) ~[elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.search.SearchService.lambda$executeQueryPhase$1(SearchService.java:340) ~[elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.action.ActionListener.lambda$map$2(ActionListener.java:145) ~[elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:62) [elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.search.SearchService$2.doRun(SearchService.java:1052) [elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:44) [elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:758) [elasticsearch-7.3.1.jar:7.3.1]",
"at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-7.3.1.jar:7.3.1]",
"at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]",
"at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]",

Caused by: java.lang.ClassCastException: class org.elasticsearch.index.fielddata.ScriptDocValues$Doubles cannot be cast to class 
org.elasticsearch.xpack.vectors.query.VectorScriptDocValues$DenseVectorScriptDocValues 
(org.elasticsearch.index.fielddata.ScriptDocValues$Doubles is in unnamed module of loader 'app'; 
org.elasticsearch.xpack.vectors.query.VectorScriptDocValues$DenseVectorScriptDocValues is in 
unnamed module of loader java.net.FactoryURLClassLoader @c3177d5)",

The query:

{
    "_index": "embs",
    "_type": "twitter_images",
    "_aknn_uri": "aknn_models/_doc/twitter_images",
    "query_aknn": {
        "_aknn_vector": [],  # this is filled with an embedding vector
        "k1": 1000,
        "k2": 100
    }
}

Initially, I thought the issue came from the type of _aknn_vector. It was half_float, so I changed it to dense_vector, but I still get the same error.

What am I missing? Is my query wrong? Or does the problem come from the fact that I'm using ES 7.3.1 while the plugin was originally built for 7.3.0?
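One way to narrow this down, as a sketch rather than a confirmed diagnosis: the ClassCastException says the script received ScriptDocValues$Doubles, which is what numeric fields such as half_float expose, so the index being searched may still map _aknn_vector as a numeric type. The helper below inspects a mapping dict for that. It assumes the live mapping has already been fetched from the cluster (for example with GET embs/_mapping) and passed in as a plain dict. The function name check_vector_mapping, the `dims` key (which, as far as I know, is the parameter name dense_vector uses in 7.x) and the default limit of 1024 are assumptions, not part of the plugin.

```python
def check_vector_mapping(mapping, field="_aknn_vector", max_dims=1024):
    """Return a list of problems found for `field` in a mapping dict.

    `mapping` is the {"properties": {...}} part of an index mapping.
    `max_dims` is an assumed limit; adjust it to your ES version.
    """
    spec = mapping.get("properties", {}).get(field)
    if spec is None:
        return [f"{field}: not present in the mapping"]

    problems = []

    # The scoring script casts doc values to dense-vector doc values,
    # so any other field type would fail at query time.
    field_type = spec.get("type")
    if field_type != "dense_vector":
        problems.append(f"{field}: mapped as {field_type}, expected dense_vector")

    # Only checked when the mapping declares a dimension count.
    dims = spec.get("dims")
    if isinstance(dims, int) and dims > max_dims:
        problems.append(f"{field}: dims={dims} exceeds assumed limit {max_dims}")

    return problems


# Example: a mapping that still has the old numeric type.
live_mapping = {"properties": {"_aknn_vector": {"type": "half_float"}}}
for problem in check_vector_mapping(live_mapping):
    print(problem)
```

Since the type of an existing field generally cannot be changed in place, it is worth running this against the mapping the cluster actually returns, not the one that was sent when creating the index.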

SthPhoenix commented 5 years ago

Hi! I haven't updated my cluster to ES 7.3.1 yet, so I can't say whether it's caused by the ES version. The intended data type for the latest version of the plugin is dense_vector, but AFAIK the max dim is hardcoded to 1024 in current ES versions. Try indexing some lower-dim data to see if the error persists.
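A quick way to get lower-dim test data without retraining anything is to cut the existing embeddings down, as in the minimal sketch below. This is only a diagnostic to see whether the error depends on dimensionality; truncation discards information and is not a recommended dimensionality reduction. The function name truncate_and_normalize is hypothetical, and the re-normalization assumes the vectors are meant to be unit length, which the `(1.0 + dotProduct(...)) / 2.0` scoring script in the stack trace suggests.

```python
import math


def truncate_and_normalize(vector, dims):
    """Keep the first `dims` components and rescale to unit L2 norm."""
    if dims <= 0 or dims > len(vector):
        raise ValueError("dims must be between 1 and len(vector)")

    head = [float(x) for x in vector[:dims]]
    norm = math.sqrt(sum(x * x for x in head))

    # An all-zero vector cannot be normalized; return it unchanged.
    if norm == 0.0:
        return head
    return [x / norm for x in head]


# Example: shrink a stand-in 1280-d embedding to 512 dimensions.
embedding = [0.01 * (i % 7) for i in range(1280)]
small = truncate_and_normalize(embedding, 512)
print(len(small))
```

The same function has to be applied to both the indexed vectors and the query vector, and the aknn_dimensions setting would need to match the reduced size.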

P.S. Just out of curiosity, if it's not a secret, which NN produces a 1280d embedding?

etudor commented 5 years ago

I'm using MobileNet trained on ImageNet from TensorFlow Hub: https://tfhub.dev/google/imagenet/mobilenet_v2_035_224/feature_vector/1