Closed iamsainianuj closed 3 years ago
Hi @iamsainianuj,
I think your issue is related to your use of KeyedVectors, not laserembeddings.
Here's what I would suggest to debug: check that Lang_based_keys actually contains different vectors, then check your use of KeyedVectors.
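One way to run that first check is sketched below, using plain NumPy so that KeyedVectors is out of the picture. This is only a sketch under assumptions: it treats the embeddings as a 2-D array with one row per sentence, and the helper names `count_duplicate_rows` and `nearest_neighbours` and the `toy` array are hypothetical, not part of laserembeddings or gensim.

```python
import numpy as np


def count_duplicate_rows(embeddings: np.ndarray) -> int:
    """Return how many rows are exact copies of another row."""
    unique_rows = np.unique(embeddings, axis=0)
    return embeddings.shape[0] - unique_rows.shape[0]


def nearest_neighbours(embeddings: np.ndarray, query_index: int, topn: int = 3):
    """Rank the other rows by cosine similarity to the row at query_index."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.clip(norms, 1e-12, None)  # avoid division by zero
    sims = unit @ unit[query_index]
    order = np.argsort(-sims, kind="stable")  # highest similarity first
    ranked = [(int(i), float(sims[i])) for i in order if i != query_index]
    return ranked[:topn]


# Toy stand-in for the real embeddings (assumption: shape is n_sentences x dim).
toy = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(count_duplicate_rows(toy))         # non-zero means some rows are identical
print(nearest_neighbours(toy, 0, topn=2))
```

If the duplicate count is zero on the real array and the neighbours look sensible here, the problem is more likely in how the vectors and keys are loaded into KeyedVectors than in the embeddings themselves.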
Hope this helps (even one year later 😅).
I'm closing the issue, feel free to re-open if needed.
I really love your work porting LASER to a Python pip package. I am new to this and trying to learn how to use these embeddings.
What I have done so far:
I have a list of sentences in three languages (say L_1, L_2, L_3).
I generated embeddings as shown below:
I am assuming that the index of each embedding matches the index of its sentence in final.
To find similarities between embeddings, I converted them into gensim KeyedVectors so that I can use functions like similar_by_vector().
But here is my issue. Suppose a sentence that was present in final when the embeddings were generated is "The iPhone SDK, set programming tools developers, enhanced support development iPad".
When I look up the closest vectors in the embedding space to the vector of that sentence, as follows:
let word = "The iPhone SDK, set programming tools developers, enhanced support development iPad"
What I see is that the closest one returned by the model is:
[('Poborsky played minutes, 291 minutes, Czech Republic Euro 2004', 1.0), .... ..]
How is it possible that the similarity is 1.0, which would mean both sentences have the same vector?
Kindly correct me wherever I am wrong.
Thank you.
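A note on reading that score: gensim's similar_by_vector() ranks by cosine similarity, so a value of 1.0 says the two vectors point in the same direction, which includes but is not limited to the case where they are identical. The small sketch below illustrates this with made-up vectors; `cosine_similarity`, `a`, and `b` are illustrative names, not taken from the issue.

```python
import numpy as np


def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between u and v."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


a = np.array([1.0, 2.0, 3.0])
b = 2.0 * a  # different values, same direction

print(np.array_equal(a, b))      # the vectors are not equal
print(cosine_similarity(a, b))   # yet the cosine similarity is (numerically) 1
```

For two unrelated sentences, a score of exactly 1.0 is still suspicious and usually points to the same vector having been stored under more than one key.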