bheinzerling / bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
https://nlp.h-its.org/bpemb
MIT License

Is the training procedure open? #62

Closed — utrobinmv closed this issue 2 years ago

utrobinmv commented 2 years ago

Thank you! I came across your embedding corpus. It's quite interesting, but I couldn't find the training procedure.

Is this library open enough that I could train it on my own text corpus, the way fastText allows?

How can I do this?