How to Output Embedded Word Vector

codertimo / BERT-pytorch

Google AI 2018 BERT pytorch implementation

Apache License 2.0


How to Output Embedded Word Vector #68

Open enze5088 opened 4 years ago

enze5088 commented 4 years ago

I want to output the word vectors.

Vesauza commented 4 years ago

Try this:

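import torch
from bert_pytorch.dataset import WordVocab  # import path assumed from the package layout

# Load the saved model and the vocabulary
model = torch.load(your_model_file)
vocab = WordVocab.load_vocab(your_vocab_file)

# Embedding tables stored in the model's state dict
tokenEmb = model.state_dict()['embedding.token.weight']
segEmb = model.state_dict()['embedding.segment.weight']
posEmb = model.state_dict()['embedding.position.weight']

# Row of the token table for the word "word"
token_emb = tokenEmb[vocab.to_seq("word")[0]]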

enze5088 commented 4 years ago

Thank you, but I want to get the vector corresponding to each word, so I'm a little confused about the weight matrix.
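
One way to read the weight matrix: row i holds the vector of the token whose vocabulary index is i. The sketch below builds a token-to-vector table from such a matrix. It assumes the vocabulary can provide an index-to-token list (called `itos` here, which is an assumption about the vocab object), and `word_vector_table` is a hypothetical helper name, not part of the repository. Toy data stands in for the real model.

```python
import torch


def word_vector_table(weight, itos):
    """Map each token to its row of the embedding weight matrix."""
    if weight.shape[0] != len(itos):
        raise ValueError("weight rows and vocabulary size differ")
    return {token: weight[i] for i, token in enumerate(itos)}


# Toy stand-ins: a 3-token vocabulary and a 3 x 4 weight matrix.
# With a real model, `weight` would be the token embedding weight
# and `itos` the vocabulary's index-to-token list (assumed to exist).
itos = ["<pad>", "hello", "word"]
weight = torch.arange(12, dtype=torch.float32).reshape(3, 4)

table = word_vector_table(weight, itos)
print(table["word"])  # the row at index 2 of `weight`
```

The returned vectors are views of the weight matrix, so call `.clone()` on them before modifying anything.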

enze5088 commented 4 years ago

I see. Thank you very much.

enze5088 commented 4 years ago

Is vocab.to_seq("word")[0] the index corresponding to the word? Can we just take the corresponding row of the matrix directly?
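
For a standard PyTorch `nn.Embedding` layer, calling the layer with an index returns the same values as the matching row of its weight matrix. The sketch below checks this on a toy layer. Whether the repository's token embedding behaves the same way is an assumption; it holds if that layer is an `nn.Embedding` or a subclass that does not override the lookup. The vocabulary dict is a hypothetical stand-in.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy vocabulary: token -> index
vocab = {"<pad>": 0, "hello": 1, "word": 2}
emb = nn.Embedding(num_embeddings=len(vocab), embedding_dim=4)

idx = vocab["word"]

# Direct row lookup in the weight matrix
row = emb.weight[idx].detach()

# Lookup through the layer's forward pass
looked_up = emb(torch.tensor([idx]))[0].detach()

same = torch.equal(row, looked_up)
print(same)
```

Note that this is the static input embedding only; it does not include any context from the transformer layers.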

rhypowang commented 4 years ago

I get an error: ModuleNotFoundError: No module named 'model.bert'
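
A likely cause, stated as an assumption: a model saved whole with `torch.save(model, ...)` is pickled, and pickle records the module path of each class. Loading then fails if that module path cannot be imported in the current environment. The sketch below reproduces the effect with the standard library only. The module names `legacy_model_pkg` and `renamed_model_pkg` are hypothetical; which real module should stand in for 'model.bert' depends on the checkout and is not shown here.

```python
import pickle
import sys
import types


def make_module(name):
    """Create an in-memory module that defines a small class called Net."""
    module = types.ModuleType(name)

    class Net:
        def __init__(self):
            self.hidden = 8

    # Make the class look like a top-level class of that module
    Net.__module__ = name
    Net.__qualname__ = "Net"
    module.Net = Net
    return module


# "Save time": the class lives in a module called legacy_model_pkg
sys.modules["legacy_model_pkg"] = make_module("legacy_model_pkg")
payload = pickle.dumps(sys.modules["legacy_model_pkg"].Net())

# "Load time": that module is no longer importable
del sys.modules["legacy_model_pkg"]
try:
    pickle.loads(payload)
    failed_first = False
except ModuleNotFoundError as exc:
    failed_first = True
    print(exc)

# Workaround: expose the renamed module under the old name before loading
sys.modules["renamed_model_pkg"] = make_module("renamed_model_pkg")
sys.modules["legacy_model_pkg"] = sys.modules["renamed_model_pkg"]
restored = pickle.loads(payload)
print(type(restored).__name__, restored.hidden)
```

In practice, running the loading script from a directory where the original package name resolves, or adding that directory to `sys.path`, has the same effect as the alias.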

rhypowang commented 4 years ago

How can I output an embedded sentence vector?
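
The thread does not answer this. One common approach, offered here only as a sketch, is to average the per-token vectors of a sentence while ignoring padding. The helper name `mean_pool` and the padding index 0 are assumptions, and the toy tensors stand in for whatever per-token vectors the model produces (shape: batch x sequence length x hidden size).

```python
import torch


def mean_pool(token_vectors, token_ids, pad_index=0):
    """Average token vectors per sentence, skipping padded positions."""
    # 1.0 for real tokens, 0.0 for padding; shape (batch, seq_len, 1)
    mask = (token_ids != pad_index).unsqueeze(-1).to(token_vectors.dtype)
    summed = (token_vectors * mask).sum(dim=1)
    # Avoid division by zero for sentences that are all padding
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts


# Toy batch: one sentence with two real tokens and one padded position
token_ids = torch.tensor([[5, 6, 0]])
token_vectors = torch.tensor([[[1.0, 1.0], [3.0, 3.0], [9.0, 9.0]]])

sentence_vector = mean_pool(token_vectors, token_ids)
print(sentence_vector)
```

Other choices, such as taking the vector at the first position, are also used; which works better depends on how the model was trained.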