Normalizes vectors for queries that come back from the Cohere API
Switches to using dot-product for vector similarity
After investigating, I found that the vectors in the msmarco-v2 dataset published by Cohere are normalised, but the query vectors returned by their API are not, contrary to what their API docs say. A separate ingestion issue made it look in CI as though the dataset, rather than the queries, was at fault.
The dataset file sizes in GCP changed, and so did the corresponding values in track.json, as a result of experimenting with normalisation; the data itself is the same.
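For reviewers, here is a minimal sketch of why the query vectors need normalising before switching to dot-product. It is illustrative only and is not the code in this PR: the helper names (`normalize`, `dot`, `cosine`) and the toy vectors are assumptions made for the example. It shows that dot-product matches cosine similarity only when both vectors have unit length.

```python
import math


def normalize(vector):
    """Scale a vector to unit L2 length (hypothetical helper, not from this PR)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        # A zero vector has no direction, so return it unchanged.
        return list(vector)
    return [x / norm for x in vector]


def dot(a, b):
    """Plain dot-product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity, which divides out both magnitudes."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Toy data: the document vector is already unit length (like the dataset),
# the raw query vector is not (like the vectors coming back from the API).
doc = normalize([3.0, 4.0])
raw_query = [6.0, 8.0]
query = normalize(raw_query)

# With an unnormalised query, dot-product is inflated by the query's magnitude.
unnormalised_score = dot(doc, raw_query)

# After normalising the query, dot-product agrees with cosine similarity.
normalised_score = dot(doc, query)
cosine_score = cosine(doc, raw_query)
```

With the query normalised, `normalised_score` and `cosine_score` agree up to floating-point error, whereas `unnormalised_score` is scaled by the length of the raw query vector.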