bheinzerling / bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
https://nlp.h-its.org/bpemb
MIT License

Subword vectors to word vector #39

Closed by susmoy-macgill36 4 years ago

susmoy-macgill36 commented 4 years ago

How can I get a word vector from subword vectors? By averaging them?

bheinzerling commented 4 years ago

Yes, averaging is a possible choice, but whether it is the best choice depends on your particular application. Please also see my comment here: https://github.com/bheinzerling/bpemb/issues/29#issuecomment-499333215
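
For illustration, a minimal sketch of the averaging approach mentioned above, written in plain Python so it runs without downloading a model. The function name `average_subword_vectors` and the toy vectors are hypothetical and not part of bpemb. In practice the input rows would be the subword vectors of one word, for example the rows that bpemb's `embed` method returns for that word, assuming it yields one vector per subword.

```python
from typing import List, Sequence


def average_subword_vectors(subword_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Return the element-wise mean of a word's subword vectors.

    Hypothetical helper for illustration only; not part of the bpemb API.
    """
    if not subword_vectors:
        raise ValueError("need at least one subword vector")
    dim = len(subword_vectors[0])
    if any(len(vector) != dim for vector in subword_vectors):
        raise ValueError("all subword vectors must have the same dimension")
    count = len(subword_vectors)
    # zip(*rows) iterates over dimensions, so each `column` holds one
    # coordinate from every subword vector.
    return [sum(column) / count for column in zip(*subword_vectors)]


# Toy 3-dimensional vectors standing in for the two subwords of one word.
toy_vectors = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
word_vector = average_subword_vectors(toy_vectors)
print(word_vector)
```

If the subword vectors are already a NumPy array with one row per subword, `array.mean(axis=0)` should give the same result. Averaging weights every subword equally, so whether it suits a task is worth checking against alternatives such as summing or a learned pooling layer.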