Cadene / skip-thoughts.torch

Porting of Skip-Thoughts pretrained models from Theano to PyTorch & Torch7

How can I get a sentence representation? #8

Open gkv91 opened 5 years ago

gkv91 commented 5 years ago

Hi, can you please tell me how I can extract a single feature vector from a sentence given either as text or as a list of word2vec vectors? Thanks.

Cadene commented 5 years ago

Is this sufficient? https://github.com/Cadene/skip-thoughts.torch/tree/master/pytorch#quick-example

gkv91 commented 5 years ago

So first we need to build a vocabulary of all the possible words? Instead of passing a vector of word indices (e.g. [1,2,3,4,0] in the example), can I feed the word2vec embeddings directly (as a 5x300 input tensor)?

Cadene commented 5 years ago

@gkv91

The dictionary of all the words that can be associated with an embedding is available here.

As it contains too many words for my task, I prefer to reduce it or create my own.

Also, note that the Skip-Thoughts model was trained using its own embedding layer, initialized with the word2vec embeddings.
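
For illustration, here is a minimal sketch of building a reduced vocabulary from your own sentences and turning each sentence into a row of word indices, similar in shape to the [1,2,3,4,0] input mentioned above. It is plain Python and does not call the skip-thoughts code. The helper names (`build_vocab`, `encode`), the whitespace tokenization, the choice to drop unknown words, and the convention that index 0 is padding and real words start at 1 are all assumptions made for this example, not something confirmed by the repository.

```python
# Hypothetical helpers for illustration only; not part of skip-thoughts.torch.

PAD_INDEX = 0  # assumption: 0 is reserved for padding, real words start at 1


def build_vocab(sentences):
    """Collect the distinct words of the corpus in first-seen order."""
    vocab = []
    seen = set()
    for sentence in sentences:
        for word in sentence.split():  # assumption: whitespace tokenization
            if word not in seen:
                seen.add(word)
                vocab.append(word)
    return vocab


def encode(sentences, vocab):
    """Map each sentence to a list of indices, right-padded to equal length."""
    word_to_index = {word: i + 1 for i, word in enumerate(vocab)}
    rows = [
        # assumption: words missing from the vocabulary are simply dropped
        [word_to_index[w] for w in sentence.split() if w in word_to_index]
        for sentence in sentences
    ]
    max_len = max(len(row) for row in rows)
    return [row + [PAD_INDEX] * (max_len - len(row)) for row in rows]


sentences = ["robots are very cool", "robots are cool"]
vocab = build_vocab(sentences)
batch = encode(sentences, vocab)
print(vocab)  # ['robots', 'are', 'very', 'cool']
print(batch)  # [[1, 2, 3, 4], [1, 2, 4, 0]]
```

The resulting `batch` could then be wrapped in a LongTensor and passed to the model in the way the quick example linked above shows. Building the vocabulary from your own corpus keeps the embedding table small, which is the motivation for reducing the full dictionary.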