dselivanov / text2vec

Fast vectorization, topic modeling, distances and GloVe word embeddings in R.
http://text2vec.org

Hash Embeddings for Efficient Word Representations #223

zachmayer closed 6 years ago

zachmayer commented 6 years ago

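Hash embeddings: a hashing-trick variant that uses several hash functions per word, plus learned importance weights, so that important words are unlikely to end up sharing the same buckets: http://papers.nips.cc/paper/7078-hash-embeddings-for-efficient-word-representations.pdf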

Seems like a pretty good idea
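
For anyone skimming, here is a rough sketch of the lookup step as I read the paper. It is plain Python, not text2vec or PyTorch code, and every name in it (`bucket`, `HashEmbedding`, `num_buckets`, `num_importance`) is made up for illustration. The weights are random here; in the paper both the shared pool and the importance weights are trained.

```python
import hashlib
import random


def bucket(token, label, size):
    """Map a token to an index in [0, size); `label` selects the hash function."""
    digest = hashlib.md5(f"{label}:{token}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "little") % size


class HashEmbedding:
    """Toy hash embedding: k hashed component vectors mixed by importance weights."""

    def __init__(self, num_buckets=1000, num_importance=10000, dim=8,
                 num_hashes=2, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        self.num_importance = num_importance
        self.dim = dim
        self.num_hashes = num_hashes
        # Shared pool of component vectors (randomly initialised, untrained).
        self.pool = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                     for _ in range(num_buckets)]
        # One row of k importance weights per hashed token id (also untrained).
        self.importance = [[rng.gauss(0.0, 0.5) for _ in range(num_hashes)]
                           for _ in range(num_importance)]

    def lookup(self, token):
        """Return the weighted sum of the k component vectors chosen for `token`."""
        weights = self.importance[bucket(token, "importance", self.num_importance)]
        vec = [0.0] * self.dim
        for i in range(self.num_hashes):
            component = self.pool[bucket(token, i, self.num_buckets)]
            for d in range(self.dim):
                vec[d] += weights[i] * component[d]
        return vec


emb = HashEmbedding()
print(emb.lookup("embedding"))
```

Two words that collide under one hash function will usually land in different buckets under the others, and training can shrink the importance weight on a colliding component, which is how important words end up with distinct vectors.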

dselivanov commented 6 years ago

Thanks for the suggestion. I'm pretty sure I won't work on this in the near future. If someone wants to try, please reopen.

YannDubs commented 6 years ago

@zachmayer if PyTorch is something you can use, you can check out my implementation, written for the NIPS implementation challenge: https://github.com/YannDubs/Hash-Embeddings