Open sumit11112 opened 4 years ago
cdist expects a two-dimensional array as input. Changing the code like this:
yy = scipy.spatial.distance.cdist([a1], [a2], "cosine")[0]
works for me.
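A minimal sketch of why the extra brackets help, using small stand-in vectors instead of real model embeddings (the values of a1 and a2 below are made up for illustration). Note that cdist with "cosine" returns the cosine distance, so the similarity is 1 minus that value.

```python
import numpy as np
from scipy.spatial.distance import cdist

# Stand-in vectors (assumption): in the thread these would be the
# embeddings a1 and a2 returned by model.encode(...).
a1 = np.array([1.0, 0.0, 1.0])
a2 = np.array([1.0, 1.0, 0.0])

# Wrapping each 1-D vector in a list gives cdist the 2-D input it expects:
# one row per sentence, one column per embedding dimension.
dist = cdist([a1], [a2], "cosine")  # shape (1, 1)

cosine_distance = dist[0][0]
cosine_similarity = 1.0 - cosine_distance

print("distance:", cosine_distance)
print("similarity:", cosine_similarity)
```

Calling a1.reshape(1, -1) instead of [a1] should give the same 2-D shape, which may read more clearly when the vectors are already NumPy arrays.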
Hi,
This might look like a stupid question, but just to confirm: can I persist the embedding arrays a1 and a2 above, and later, when needed, load them again and run yy = scipy.spatial.distance.cdist([a1], [a2], "cosine")[0]? That will work, right?
Also, since I am using 'distiluse-base-multilingual-cased', will the embedding of a French sentence be close to that of its English version?
Yes, you can persist the embeddings on disk and load them later. You can use pickle or the numpy save/load functions.
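As a rough sketch of the numpy route, assuming a1 and a2 are 1-D NumPy arrays (the vectors and file names below are placeholders, not real embeddings):

```python
import os
import tempfile

import numpy as np
from scipy.spatial.distance import cdist

# Placeholder vectors (assumption): stand-ins for the embeddings a1 and a2.
a1 = np.array([0.2, 0.1, 0.7])
a2 = np.array([0.3, 0.4, 0.5])

# A temporary directory keeps the sketch self-contained; in practice any
# writable path would do.
with tempfile.TemporaryDirectory() as tmp:
    path1 = os.path.join(tmp, "a1.npy")  # hypothetical file names
    path2 = os.path.join(tmp, "a2.npy")

    # Persist the embeddings.
    np.save(path1, a1)
    np.save(path2, a2)

    # Later: load them back.
    loaded_a1 = np.load(path1)
    loaded_a2 = np.load(path2)

# The distance computed from the reloaded arrays matches the original one.
before = cdist([a1], [a2], "cosine")[0]
after = cdist([loaded_a1], [loaded_a2], "cosine")[0]
print(before, after)
```

The loaded arrays are still 1-D, so they need the same [a1], [a2] wrapping before being passed to cdist.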
Hi,
The code below is throwing an error at cdist(a1, a2, 'cosine')[0][0].
How can I measure cosine similarity between sentences? I cannot provide all the strings in a single array in one shot.
from sentence_transformers import SentenceTransformer, LoggingHandler
from sentence_transformers import models, losses
import numpy as np

model = SentenceTransformer('distiluse-base-multilingual-cased')

# Encode the first sentence and keep its embedding in a1
sentences = ['This framework generates embeddings for each input sentence']
sentence_embeddings = model.encode(sentences)
a1 = 1
for sentence, embedding in zip(sentences, sentence_embeddings):
    print("Sentence:", sentence)
    print("Embedding:", embedding)
    a1 = embedding
    print("")

# Encode the second sentence and keep its embedding in a2
sentences = ['Sentences are passed as a list of string.']
sentence_embeddings = model.encode(sentences)
a2 = 1
for sentence, embedding in zip(sentences, sentence_embeddings):
    print("Sentence:", sentence)
    print("Embedding:", embedding)
    a2 = embedding
    print("")

import scipy.spatial
from scipy.spatial.distance import cdist

# a1 and a2 are 1-D vectors at this point; these are the calls that fail
yy = scipy.spatial.distance.cdist(a1, a2, "cosine")[0]
print(yy)

Y = cdist(a1, a2, 'cosine')[0][0]
print(Y)