Open surabhisnath opened 2 weeks ago
@surabhisnath This is clearly a bug in the HuggingFace port of the model. I'm going to investigate and fix it for you. In the meantime, if you load the model through this library it will work correctly!
Thank you. Sounds great. Please let me know when fixed :)
Hi,
I am trying to get contextual text embeddings: for example, the embedding of the string "cat" under several different contexts, such as (1) "pets" and (2) "nuclear physics". I want to investigate how the distances between strings vary with context (for instance, we would expect the distance between "cat" and "dog" in the context of pets to differ from their distance in the context of nuclear physics).
I tried to use your model by passing various context texts to get `dataset_embeddings`, and then embedding my strings to obtain `doc_embeddings` under each set of `dataset_embeddings`. As follows:

However, I find that all the `doc_embeddings` are exactly the same, i.e., the `doc_embeddings` do not change regardless of which `dataset_embeddings` I use.

Is that expected, or am I doing something wrong here? How else could I achieve the behaviour I expect with your model?
Thanks!
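For anyone who wants to reproduce the check, here is a minimal sketch of how the comparison could be done once the embeddings are in hand. It is not the snippet from my post and it does not call the model: the vectors are hard-coded placeholders, and the helper names (`cosine_distance`, `max_abs_difference`, `contexts_are_identical`) and the tolerance `tol` are my own assumptions rather than part of any library.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def max_abs_difference(emb_a, emb_b):
    """Largest element-wise gap between two lists of embedding vectors."""
    return max(
        abs(a - b)
        for vec_a, vec_b in zip(emb_a, emb_b)
        for a, b in zip(vec_a, vec_b)
    )


def contexts_are_identical(doc_embeddings_by_context, tol=1e-6):
    """True if every pair of contexts yields the same embeddings within tol."""
    return all(
        max_abs_difference(
            doc_embeddings_by_context[c1], doc_embeddings_by_context[c2]
        ) <= tol
        for c1, c2 in combinations(doc_embeddings_by_context, 2)
    )


# Placeholder vectors standing in for model output (hypothetical values):
# row 0 plays the role of "cat", row 1 the role of "dog".
doc_embeddings_by_context = {
    "pets": [[1.0, 0.0], [0.8, 0.6]],
    "nuclear physics": [[1.0, 0.0], [0.8, 0.6]],
}

# Report whether the context made any difference at all.
print(contexts_are_identical(doc_embeddings_by_context))

# Report the "cat"/"dog" distance under each context.
for context, (cat, dog) in doc_embeddings_by_context.items():
    print(context, round(cosine_distance(cat, dog), 4))
```

With real model output in place of the placeholders, `contexts_are_identical` returning `True` would correspond to the behaviour I am describing, and the per-context distances would all come out equal.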