koaning / embetter

just a bunch of useful embeddings
https://koaning.github.io/embetter/
MIT License

Revisit contrastive finetuner #73

Closed koaning closed 1 year ago

koaning commented 1 year ago

It seems I glossed over something, which might help explain the benchmarks.

For each sentence pair, we pass sentence A and sentence B through our network which yields the embeddings u and v. The similarity of these embeddings is computed using cosine similarity and the result is compared to the gold similarity score. This allows our network to be fine-tuned and to recognize the similarity of sentences.

It's using cosine similarity when it compares against the gold similarity score ... which isn't what we are doing.

koaning commented 1 year ago

It also totally makes sense. If you're interested in "search" then you want to make sure that you're able to leverage cosine similarity. That also means that you gotta use it during training.
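To illustrate the search side of that argument, here is a minimal sketch (the function name is hypothetical, not embetter's API) of ranking candidate embeddings by cosine similarity to a query — the retrieval step that only behaves well if the embeddings were trained with the same metric:

```python
import numpy as np

def cosine_rank(query, candidates):
    # Rank candidate embeddings by cosine similarity to the query,
    # highest first. Returns indices into `candidates`.
    q = query / np.linalg.norm(query)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    sims = c @ q
    return np.argsort(-sims)

query = np.array([1.0, 0.0])
candidates = np.array([
    [0.0, 1.0],   # orthogonal to the query
    [1.0, 0.1],   # nearly aligned with the query
    [-1.0, 0.0],  # opposite direction
])
print(cosine_rank(query, candidates))  # [1 0 2]
```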

This line needs updating.