ExplorerFreda / Structured-Self-Attentive-Sentence-Embedding

An open-source implementation of the paper "A Structured Self-Attentive Sentence Embedding" (Lin et al., ICLR 2017).
GNU General Public License v3.0

About GLOVE model #5

Open jx00109 opened 6 years ago

jx00109 commented 6 years ago

Recently, I used torchtext to get the GloVe model. From this module I got the dictionary that maps each word to an index, plus the embedding matrix (shape word_count * dim, a torch.FloatTensor). To create the file that train.py expects, I wrote my code like this:

t = (dictionary, embedding_matrix, dim)
torch.save(t, 'mypath/glove.pt')

Is glove.pt then in the format that your program expects?
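For what it's worth, a tuple saved this way round-trips cleanly through torch.save / torch.load, so the file can at least be sanity-checked before handing it to train.py. A minimal sketch (the toy vocabulary, matrix sizes, and file name below are illustrative, not from the repo):

```python
import torch

# Toy stand-ins for the real GloVe dictionary and embedding matrix.
dictionary = {'hello': 0, 'world': 1}
embedding_matrix = torch.randn(2, 300)  # word_count x dim FloatTensor
dim = 300

# Save the 3-tuple in the format described above.
t = (dictionary, embedding_matrix, dim)
torch.save(t, 'glove_toy.pt')

# torch.load gives back the same 3-tuple, so it unpacks directly.
d, emb, loaded_dim = torch.load('glove_toy.pt')
assert loaded_dim == 300 and emb.shape == (2, 300)
```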

DenisDsh commented 6 years ago

This is how I created the GloVe model:


import torch
from torchtext import data

TEXT = data.Field(sequential=True)
LABEL = data.Field(sequential=False)

train, val, test = data.TabularDataset.splits(
        path='./', train='train.json',
        validation='val.json', test='test.json', format='json',
        fields={'text': ('text', TEXT),
             'label': ('label', LABEL)})

TEXT.build_vocab(train, vectors="glove.42B.300d")

dictionary = TEXT.vocab.stoi
vectors = TEXT.vocab.vectors
dim = TEXT.vocab.vectors.size()[1]  # 300 in this case

torch.save((dictionary, vectors, dim), './GloVe/glove.42B.300d.pt')

Took inspiration from:
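Once the file is saved in that (dictionary, vectors, dim) format, looking up a word's embedding is just an index into the matrix via stoi. A small sketch, using a toy vocabulary in place of TEXT.vocab (since rebuilding the real 42B vocab needs the vectors on disk):

```python
import torch

# Toy stand-ins for TEXT.vocab.stoi and TEXT.vocab.vectors.
dictionary = {'<unk>': 0, '<pad>': 1, 'sentence': 2}
vectors = torch.randn(len(dictionary), 300)
dim = vectors.size(1)
torch.save((dictionary, vectors, dim), 'glove_toy.pt')

# Reload and fetch the embedding row for a word.
stoi, emb, d = torch.load('glove_toy.pt')
vec = emb[stoi['sentence']]
assert vec.shape == (300,) and d == 300
```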