Closed jingliao132 closed 4 years ago
Hi @jingliao132,
sent2vec already provides an interface to use the embeddings:
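import sent2vec
# Load the pre-trained model from its binary file
model = sent2vec.Sent2vecModel()
model.load_model('model.bin')
# Embed a single (tokenized, space-separated) sentence
emb = model.embed_sentence("once upon a time .")
# Embed several sentences in one call
embs = model.embed_sentences(["first sentence .", "another sentence"])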
You can then use the resulting embedding vectors in any way (e.g., as input for a sentence classifier).
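To make the "input for a sentence classifier" idea concrete, here is a minimal sketch of a nearest-centroid classifier over sentence embeddings. It is only an illustration, not the setup used in this repository. The functions `toy_embed`, `cosine`, `fit_centroids`, and `predict` are hypothetical names introduced here, and `toy_embed` is a stand-in embedder so that the snippet runs without a model file.

```python
import math
from collections import defaultdict


def toy_embed(sentence, dim=8):
    """Hypothetical stand-in for a real sentence embedder (NOT sent2vec).

    Each token is mapped to one of `dim` buckets and counted.
    """
    vec = [0.0] * dim
    for token in sentence.lower().split():
        vec[sum(ord(c) for c in token) % dim] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def fit_centroids(sentences, labels, embed):
    """Average the embeddings of all training sentences per label."""
    sums = {}
    counts = defaultdict(int)
    for sentence, label in zip(sentences, labels):
        vec = embed(sentence)
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [a + b for a, b in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [v / counts[label] for v in total]
            for label, total in sums.items()}


def predict(sentence, centroids, embed):
    """Return the label whose centroid is most similar to the sentence."""
    vec = embed(sentence)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))
```

With a loaded sent2vec model, `embed` could be something like `lambda s: model.embed_sentence(s)[0].tolist()`, assuming `embed_sentence` returns a 1 x dim array; please check the output shape in your installation. In practice you would probably feed the vectors to a stronger classifier than this one.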
Regarding the window size: sent2vec uses dynamic context windows, as described in the paper. If the question is about the n-gram size, we use the default value (2).
Sorry for the late reply. We hope you were still able to resolve the issues.
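In case it helps, an n-gram size of 2 means that word bigrams are used as features in addition to single words. The sketch below only lists those features for a tokenized sentence; `word_ngrams` is a hypothetical helper written for this illustration and is not how the library implements it internally.

```python
def word_ngrams(tokens, max_n=2):
    """Return all contiguous word n-grams of length 1..max_n."""
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


features = word_ngrams("once upon a time .".split())
# unigrams first, then bigrams:
# ['once', 'upon', 'a', 'time', '.', 'once upon', 'upon a', 'a time', 'time .']
```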
Best, Ji-Ung
Thanks for your help. My issues are well-resolved.
Best regards, Jing
(Sent by email in reply to Ji-Ung Lee's message of 8 September 2020, 17:15.)
Hello, UKPLab. I have downloaded the pre-trained embeddings in binary format. How can they be turned into embedding vectors for a given vocabulary? Do you provide an interface similar to gensim's?
Also, I could not find the setting for the CNN window size, although the CNN-non-static model in (Kim, 2014), which you refer to, suggests that the window size should be defined. Is the tweet length fixed in the input? If so, what is the window size?
In general, I am not quite clear about how you preprocess the original tweets. Could you please describe this, or recommend any resources on initializing embeddings for tweets?
Thanks a lot.