lior-k / fast-elasticsearch-vector-scoring

Score documents using embedding-vectors dot-product or cosine-similarity with ES Lucene engine
Apache License 2.0

Read string field for Document #41

Closed KTOIA closed 4 years ago

KTOIA commented 4 years ago

Hi, Lior. I would like to run a similarity search not over a plain vector of numbers, but over a vector of objects in which each number has a name: [{"name":"a", "value": 0.42},{"name":"b", "value": 0.52}]. Is there any way to read a string value, the same way BinaryDocValues is read?

lior-k commented 4 years ago

No. The plugin expects a comma-separated list of floats after it decodes the Base64 embedding.
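
To illustrate the format described in that reply, here is a minimal sketch in plain Java. It takes the description at face value (Base64 text that decodes to a comma-separated list of floats); the plugin's own decoding code is not shown in this thread, and the class and method names below are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of the plugin.
final class EmbeddingDecoder {

    // Assumption: the Base64 payload decodes to text such as "0.42,0.52".
    static float[] decode(String base64) {
        String csv = new String(Base64.getDecoder().decode(base64), StandardCharsets.UTF_8);
        String[] parts = csv.split(",");
        float[] vector = new float[parts.length];
        for (int i = 0; i < parts.length; i++) {
            vector[i] = Float.parseFloat(parts[i].trim());
        }
        return vector;
    }

    // Plain dot product between two vectors of equal length.
    static double dot(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector lengths differ");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += (double) a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // Build a sample payload the same way a client might before indexing.
        String stored = Base64.getEncoder()
                .encodeToString("0.42,0.52".getBytes(StandardCharsets.UTF_8));
        float[] doc = decode(stored);
        float[] query = {1.0f, 2.0f};
        System.out.println(dot(doc, query));
    }
}
```

A payload of this shape carries only positional numbers, so there is nowhere to attach a name to each value.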


KTOIA commented 4 years ago

Yes, I understand that the current implementation doesn't support this approach, but maybe you know how to read a string field? The only way I have found is to get the whole document as a string and convert it to an Object, but that is dramatically inefficient.

KTOIA commented 4 years ago

I found it: leafContext.reader().getSortedSetDocValues(fieldName); That works for me. Thank you for your quick answer!
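
One thing to keep in mind with that approach: as far as I know, SortedSetDocValues exposes each document's terms as a sorted, de-duplicated set, so the original array order is not preserved and the name has to travel inside the term itself. The sketch below shows only the parsing and scoring side in plain Java. The "name=value" term encoding and the NamedVector class are assumptions made for illustration, and a plain list stands in for the terms that would be read from doc values.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: turns "name=value" terms into a named vector.
final class NamedVector {

    // Assumption: every term looks like "a=0.42"; the last '=' separates name from value.
    static Map<String, Float> parse(Iterable<String> terms) {
        Map<String, Float> out = new HashMap<>();
        for (String term : terms) {
            int sep = term.lastIndexOf('=');
            if (sep <= 0) {
                throw new IllegalArgumentException("expected name=value, got: " + term);
            }
            out.put(term.substring(0, sep), Float.parseFloat(term.substring(sep + 1)));
        }
        return out;
    }

    // Dot product over the names present in both vectors; missing names count as 0.
    static double dot(Map<String, Float> a, Map<String, Float> b) {
        double sum = 0.0;
        for (Map.Entry<String, Float> e : a.entrySet()) {
            Float other = b.get(e.getKey());
            if (other != null) {
                sum += e.getValue().doubleValue() * other;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Stand-in for terms read from the document's doc values.
        Map<String, Float> doc = parse(Arrays.asList("a=0.42", "b=0.52"));
        Map<String, Float> query = parse(Arrays.asList("a=1.0", "c=3.0"));
        System.out.println(dot(doc, query));
    }
}
```

Because identical terms collapse into one entry in a set, two components with the same name and value would be stored once; unique names avoid that.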