stanfordnlp / GloVe

Software in C and data files for the popular GloVe model for distributed word representations, a.k.a. word vectors or embeddings
Apache License 2.0
6.86k stars 1.51k forks

Idea - rebuild glove vectors #165

Open AngledLuffa opened 4 years ago

AngledLuffa commented 4 years ago

Goal: build new GloVe vectors with a current vocabulary

Wikipedia + Gigaword, maybe Common Crawl and/or Twitter as well

Could look at Attardi's Wikipedia cleaner

note for internal use: /u/downloads/data

Big-Tree commented 2 years ago

This would be very useful. I want something up to date and I can't seem to find any alternatives.