z-zawhtet-a opened this issue 8 years ago
This is a very cool paper! I am wondering if any of the DNN packages implement something like this.
An alternative is to approximate the softmax with negative sampling (http://stackoverflow.com/questions/27860652/word2vec-negative-sampling-in-layman-term) or with another softmax approximation like the ones used in TensorFlow. TensorFlow has a sampled-softmax loss function, tf.nn.sampled_softmax_loss: https://www.tensorflow.org/versions/master/api_docs/python/nn.html#candidate-sampling
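To make the idea concrete, here is a minimal numpy sketch of the negative-sampling loss from the Stack Overflow link above: instead of normalizing over all 200k classes, each example pays a binary logistic cost for its true class plus a handful of uniformly sampled negatives. All sizes and names (`num_sampled`, `weights`, etc.) are illustrative assumptions, not anyone's actual API; real implementations like tf.nn.sampled_softmax_loss use smarter samplers and vectorize over the batch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a large vocabulary, small hidden dimension.
num_classes, dim, batch, num_sampled = 200_000, 32, 4, 10

weights = rng.normal(scale=0.1, size=(num_classes, dim))  # output embeddings
biases = np.zeros(num_classes)
inputs = rng.normal(size=(batch, dim))             # hidden-layer activations
labels = rng.integers(0, num_classes, size=batch)  # true class per example

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(inputs, labels, num_sampled):
    """Binary logistic loss on the true class plus a few sampled
    negatives; never touches the other ~200k output rows."""
    losses = []
    for h, y in zip(inputs, labels):
        # Positive term: push the true class's logit up.
        pos_logit = h @ weights[y] + biases[y]
        loss = -np.log(sigmoid(pos_logit))
        # Negative terms: push logits of a few random classes down.
        # (Uniform sampling here; word2vec uses a unigram^0.75 sampler.)
        negatives = rng.integers(0, num_classes, size=num_sampled)
        neg_logits = weights[negatives] @ h + biases[negatives]
        loss += -np.log(sigmoid(-neg_logits)).sum()
        losses.append(loss)
    return float(np.mean(losses))

print(negative_sampling_loss(inputs, labels, num_sampled))
```

Per step this touches only `1 + num_sampled` rows of `weights` instead of all `num_classes`, which is the whole point when the output layer is this wide.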
Keras probably has something like that too...
See also http://arxiv.org/pdf/1412.7091.pdf. 200,000 softmax outputs is just too many!