Open havingfun opened 4 years ago
Resurrecting this. These models have enormous vocabs that could prove useful for more esoteric problems; I would love to be able to use them easily.
Sure, why not. I'm +1 on including those.
Please check https://github.com/RaRe-Technologies/gensim-data#want-to-add-a-new-corpus-or-model; we'll need:
a) Text that motivates adding each model (should be easy), including any links to its original research and preprocessing options, its license, etc. Basically a quick summary of "What is this?" and "Who is it for?"
b) Code that loads these models (to include in __init__.py; see e.g. fasttext-wiki-news-subwords-300). Again, should be easy; IIRC we already support the GloVe data format.
Cheers!
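For anyone picking up item b), here is a rough sketch of the usual conversion step, assuming the raw GloVe text layout (one token per line followed by space-separated floats, no header line). The function name `glove_to_word2vec` and the file names are hypothetical, chosen only for illustration; this is not the loader code from gensim-data.

```python
import os
import tempfile


def glove_to_word2vec(glove_path, out_path):
    """Copy a GloVe text file to out_path, prepending a word2vec-style header.

    Hypothetical helper for illustration. Returns (num_vectors, num_dims).
    """
    num_vectors = 0
    num_dims = 0
    # First pass: count the vectors and read the dimensionality off the first row.
    with open(glove_path, encoding="utf8") as src:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            if num_vectors == 0:
                # The first field is the token, the rest are the vector components.
                num_dims = len(line.rstrip().split(" ")) - 1
            num_vectors += 1
    # Second pass: write the "<count> <dims>" header, then the vectors unchanged.
    with open(glove_path, encoding="utf8") as src, \
            open(out_path, "w", encoding="utf8") as dst:
        dst.write(f"{num_vectors} {num_dims}\n")
        for line in src:
            if line.strip():
                dst.write(line)
    return num_vectors, num_dims


if __name__ == "__main__":
    # Tiny made-up file, only to show the call; real GloVe files are far larger.
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "glove.sample.txt")
        dst_path = os.path.join(tmp, "glove.sample.w2v.txt")
        with open(src_path, "w", encoding="utf8") as f:
            f.write("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
        print(glove_to_word2vec(src_path, dst_path))
```

The converted file should then be readable with `KeyedVectors.load_word2vec_format(path, binary=False)`. On gensim 4.x the conversion can probably be skipped by passing `no_header=True` to the same call on the raw GloVe file.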
Hi Team,
I see that we are missing two of the pretrained GloVe models published by Stanford at https://nlp.stanford.edu/projects/glove/. The ones that can be added are:
Thanks, Rajesh