alexklibisz / elastiknn

Elasticsearch plugin for nearest neighbor search. Store vectors and run similarity search using exact and approximate algorithms.
https://alexklibisz.github.io/elastiknn
Apache License 2.0
362 stars, 48 forks

Added dot product everywhere where cosine similarity was used #676

Open joancf opened 3 months ago

joancf commented 3 months ago

Related Issue

Support for inner product similarity measures

Changes

Almost everywhere there was a cosine reference (except in some tests), a parallel dot product function/option has been added.

What changed?

Most files where cosine was used.

Testing and Validation

Not done yet; this is still pending. I'll try to build the plugin, swap it into my ES installation, and check it, but I'm not sure which procedures I must follow.

joancf commented 3 months ago

Hi @alexklibisz. First, thanks for the fast response, and for the plugin itself! Let me apologize for sending the pull request before testing it thoroughly. It's my first time working in Scala, and I'm a bit confused about how to do some things. I finally managed to compile and build the zip on my side, so I can try running the plugin (and check whether it works and how it performs). But as I said, my knowledge of Scala is limited to what I know from Java, and I'm not able to do some of the things you asked for. (Testing, basically, is where I think it will take me a while to understand everything.)

About the changes you asked for: I made all of them. One I did differently: I ensured that the similarity returns a non-negative value by using max(0, 1 + dotProduct). This way we don't raise an exception, and negative values get a similarity of 0.
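For readers following along, the clamping described above can be sketched in a few lines. This is only an illustration in Python, not the plugin's Scala/Java code; the names `dot` and `clamped_dot_similarity` are hypothetical and do not come from the pull request.

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Plain dot product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))


def clamped_dot_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Hypothetical helper: max(0, 1 + dotProduct).

    Any dot product below -1 is mapped to a similarity of 0
    instead of raising an exception.
    """
    return max(0.0, 1.0 + dot(u, v))
```

Note that the clamp only handles the lower bound: if the vectors are not unit length, 1 + dotProduct can still exceed 2.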

Thanks Joan

alexklibisz commented 3 months ago

Hi @joancf can you try adding the exceptions for scores outside [0, 2]? If you're having trouble I can try this, but probably not til the weekend or next week.

joancf commented 3 months ago

Hi @alexklibisz, my company doesn't want to use it, so I will finish it outside working hours. I'll do my best with the exceptions and testing.