bheinzerling / bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
https://nlp.h-its.org/bpemb
MIT License
1.18k stars, 101 forks

The index for <unk> is 0, so what about <pad>? #26

Closed: ghost closed this issue 5 years ago

ghost commented 5 years ago

I ask because <pad> is often assigned index 0, which here is already taken by <unk>.
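One common way around this clash, sketched below under stated assumptions, is to leave <unk> at index 0 and give <pad> the first unused index by appending an all-zero row to the embedding matrix. The matrix here is a random NumPy stand-in rather than vectors loaded from bpemb, and the names `vectors`, `PAD_ID`, and `pad_batch` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Stand-in for a pre-trained embedding matrix (sizes are made up);
# row 0 plays the role of <unk>, as described in the issue title.
vocab_size, dim = 1000, 50
rng = np.random.default_rng(0)
vectors = rng.normal(size=(vocab_size, dim)).astype(np.float32)

UNK_ID = 0
PAD_ID = vocab_size  # first index not used by the existing vocabulary

# Append one all-zero row so that PAD_ID maps to a zero vector.
padded_vectors = np.vstack([vectors, np.zeros((1, dim), dtype=np.float32)])


def pad_batch(sequences, pad_id=PAD_ID):
    """Right-pad lists of token ids to the length of the longest one."""
    max_len = max(len(seq) for seq in sequences)
    return np.array([seq + [pad_id] * (max_len - len(seq)) for seq in sequences])


batch = pad_batch([[5, UNK_ID, 17], [42]])
embedded = padded_vectors[batch]  # shape: (batch, max_len, dim)
```

With this layout, existing subword ids keep their meaning, and a downstream model can treat `PAD_ID` as the padding index (for example when building an attention mask).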