bheinzerling / bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
https://nlp.h-its.org/bpemb
MIT License

Continue training #36

Closed ericlingit closed 4 years ago

ericlingit commented 4 years ago

Is it possible to continue training with your pre-trained models?

This page states that

BPEmb objects wrap a gensim KeyedVectors instance

and gensim's documentation mentions that:

The reason for separating the trained vectors into KeyedVectors is that if you don’t need the full model state any more (don't need to continue training), the state can be discarded, resulting in a much smaller and faster object ...

I'm assuming the answer is no? Please correct me if I'm wrong.
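
For reference, this is roughly how I'm inspecting the wrapped object; I'm guessing the KeyedVectors instance is exposed as the emb attribute:

from bpemb import BPEmb

bpemb_en = BPEmb(lang="en", vs=100000, dim=100)
# the wrapped gensim KeyedVectors instance (attribute name is my guess)
print(type(bpemb_en.emb))
# the raw embedding matrix: 100k subwords x 100 dimensions
print(bpemb_en.vectors.shape)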

bheinzerling commented 4 years ago

That section of the gensim documentation refers to continuing training with gensim specifically. I didn't train the embeddings with gensim, but with GloVe, so it doesn't really apply here. If you want to continue training the embeddings (this is usually called "fine-tuning"), you can load them into a deep learning framework like PyTorch:

>>> from torch import nn, tensor
>>> from bpemb import BPEmb
>>> bpemb_en = BPEmb(lang="en", vs=100000, dim=100)
>>> # freeze=False keeps the embedding weights trainable; the default (freeze=True) would block fine-tuning
>>> emb_layer = nn.Embedding.from_pretrained(tensor(bpemb_en.vectors), freeze=False)
>>> emb_layer
Embedding(100000, 100)
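
If you want to see the whole thing end to end, here is a rough sketch of one fine-tuning step; the toy classifier, mean pooling, optimizer settings, and dummy batch are placeholder choices for illustration, not something bpemb provides:

import torch
from torch import nn, tensor
from bpemb import BPEmb

bpemb_en = BPEmb(lang="en", vs=100000, dim=100)

class ToyClassifier(nn.Module):
    """Minimal model on top of the pre-trained subword embeddings."""
    def __init__(self, vectors, num_classes=2):
        super().__init__()
        # freeze=False keeps the embedding weights trainable
        self.emb = nn.Embedding.from_pretrained(tensor(vectors), freeze=False)
        self.fc = nn.Linear(vectors.shape[1], num_classes)

    def forward(self, subword_ids):
        # mean-pool the subword embeddings, then classify
        return self.fc(self.emb(subword_ids).mean(dim=1))

model = ToyClassifier(bpemb_en.vectors)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# one dummy training step on a single made-up example
ids = torch.tensor([bpemb_en.encode_ids("this is a test")])  # shape: (1, seq_len)
labels = torch.tensor([0])
optimizer.zero_grad()
loss = loss_fn(model(ids), labels)
loss.backward()
optimizer.step()

After the step, the embedding weights have been updated along with the rest of the model, which is all "continuing training" amounts to here.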