Hi @joonable, thanks for asking.
I just checked this out briefly. I think I'll need more information as I cannot replicate the problem. For example, if I run the code through the variable initialization and create a feed-dictionary, then I run the following commands:
```
In[39]: sess.run(doc_embed, feed_dict=feed_dict)
Out[39]:
array([[[ 0.36113167, -0.42523894, 0.08636531, ..., 0.9411001 ,
-0.8095024 , -0.38859203]],
[[ 0.36113167, -0.42523894, 0.08636531, ..., 0.9411001 ,
-0.8095024 , -0.38859203]],
[[ 0.36113167, -0.42523894, 0.08636531, ..., 0.9411001 ,
-0.8095024 , -0.38859203]],
...,
[[ 0.7726636 , -0.4221473 , -0.28463227, ..., -0.00291947,
0.49912193, -0.26189896]],
[[ 0.7726636 , -0.4221473 , -0.28463227, ..., -0.00291947,
0.49912193, -0.26189896]],
[[ 0.7726636 , -0.4221473 , -0.28463227, ..., -0.00291947,
0.49912193, -0.26189896]]], dtype=float32)
In[40]: sess.run(train_step, feed_dict=feed_dict)
In[41]: sess.run(doc_embed, feed_dict=feed_dict)
Out[41]:
array([[[ 0.3611314 , -0.42523894, 0.08636572, ..., 0.94110006,
-0.8095023 , -0.38859165]],
[[ 0.3611314 , -0.42523894, 0.08636572, ..., 0.94110006,
-0.8095023 , -0.38859165]],
[[ 0.3611314 , -0.42523894, 0.08636572, ..., 0.94110006,
-0.8095023 , -0.38859165]],
...,
[[ 0.7726636 , -0.42214715, -0.2846323 , ..., -0.00291951,
0.49912196, -0.26189905]],
[[ 0.7726636 , -0.42214715, -0.2846323 , ..., -0.00291951,
0.49912196, -0.26189905]],
[[ 0.7726636 , -0.42214715, -0.2846323 , ..., -0.00291951,
0.49912196, -0.26189905]]], dtype=float32)
```
This shows me that the variable doc_embed is changing due to the training. Are you seeing something different? If so, make sure you have the most up-to-date code, and also let me know your Python and TensorFlow versions.
I'll continue to troubleshoot with you if you see something different. I think the next step would be to fix a random seed for TensorFlow and NumPy and see what we can do, assuming we have the same versions of everything. For reference, I'm running Python 3.6 and TensorFlow v1.10.1.
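For the seed-fixing step, a minimal sketch (TF 1.x API, matching the versions above; the seed value is arbitrary):

```python
import numpy as np
import tensorflow as tf

np.random.seed(42)      # fix NumPy's RNG (used for batch generation)
tf.set_random_seed(42)  # fix the graph-level seed; call this before building the graph
```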
Thanks.
I checked it the way you presented, and there is indeed a difference after the training. Many apologies; I've added the code below.
```
In[32]: doc_origin = doc_embeddings.eval(sess)
In[33]: for i in range(5000) : sess.run(train_step, feed_dict=feed_dict)
In[34]: doc_eval = doc_embeddings.eval(sess)
In[35]: doc_origin - doc_eval
Out[35]:
array([[ 7.57873058e-05, 5.39273024e-05, -3.54051590e-05, ...,
3.53846699e-05, 4.13656235e-05, -6.90221786e-05],
[-1.03056431e-04, 3.75509262e-06, 6.04987144e-05, ...,
-3.19242477e-04, -1.00910664e-04, -1.72302127e-04],
[-1.60932541e-06, 3.51667404e-06, -8.94069672e-07, ...,
-2.14576721e-06, -4.35113907e-06, 1.75833702e-06],
...,
[-1.67489052e-04, 1.15454197e-04, 2.23517418e-05, ...,
-2.02655792e-06, -8.34465027e-06, 5.33461571e-05],
[-4.18424606e-05, -1.31400302e-05, -2.86102295e-05, ...,
1.12056732e-05, -6.37024641e-06, 4.05311584e-06],
[-1.25169754e-05, -2.87890434e-05, 1.23977661e-05, ...,
-1.21593475e-05, -6.26444817e-05, 5.59091568e-05]], dtype=float32)
```
It's not just about troubleshooting; I have a problem to solve. I'm using doc2vec to cluster unlabelled documents. However, as you can see, the difference is so small that the embeddings essentially stay at their random.uniform initialisation. I trained them for enough iterations that the loss at each step has stopped decreasing, though.
Even after more than 200K iterations, the doc embeddings haven't changed much.
```
In[36]: for i in range(200000) : sess.run(train_step, feed_dict=feed_dict)
In[37]: doc_eval_200K = doc_embeddings.eval(sess)
In[38]: doc_origin
Out[38]:
array([[-0.40346146, -0.22738123,  0.6981292 , ...,  0.02518272,
         0.6519067 ,  0.5756016 ],
       [-0.71823335,  0.9682684 , -0.47529078, ..., -0.44264603,
        -0.84275126,  0.1408112 ],
       [-0.91523314,  0.63673115,  0.33543396, ..., -0.635123  ,
         0.8932848 , -0.0469408 ],
       ...,
       [-0.95611143,  0.63165283,  0.20844555, ..., -0.95574784,
         0.803643  ,  0.8626468 ],
       [-0.87971663, -0.00883818,  0.8690052 , ..., -0.9107895 ,
         0.11327219,  0.52236867],
       [ 0.9117298 ,  0.5722585 ,  0.87356305, ..., -0.65226054,
        -0.31751704, -0.7709594 ]], dtype=float32)
In[39]: doc_eval_200K
Out[39]:
array([[-0.40350893, -0.22704063,  0.6981595 , ...,  0.02526901,
         0.6520689 ,  0.57503295],
       [-0.71688116,  0.9653056 , -0.47172707, ..., -0.44319224,
        -0.83652633,  0.13944209],
       [-0.91519636,  0.6366936 ,  0.335434  , ..., -0.6351552 ,
         0.89335924, -0.04687748],
       ...,
       [-0.9555648 ,  0.6306714 ,  0.20914698, ..., -0.955652  ,
         0.8043847 ,  0.86161727],
       [-0.8796184 , -0.00869403,  0.8691123 , ..., -0.91070646,
         0.11326376,  0.52240765],
       [ 0.91199374,  0.57255834,  0.8732707 , ..., -0.65196055,
        -0.3172496 , -0.7709833 ]], dtype=float32)
```
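To put those numbers in scale, a rough check like the following (my own sketch, using `doc_origin` and `doc_eval_200K` from above) shows the drift is orders of magnitude smaller than the `random.uniform(-1, 1)` initialisation:

```python
import numpy as np

# Mean absolute change after 200K steps vs. the scale of the initial values.
drift = np.abs(doc_origin - doc_eval_200K)
print("mean |delta|:", drift.mean())               # on the order of 1e-4 here
print("init scale:  ", np.abs(doc_origin).mean())  # ~0.5 for uniform(-1, 1)
```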
When I use gensim, I can see a clear difference, but I have to use TensorFlow for my research because I need to modify the algorithm. Any advice you can give would definitely be helpful. Thank you.
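For reference, the gensim comparison was along these lines (a toy sketch; the corpus and hyperparameters here are illustrative assumptions, not my actual setup):

```python
from gensim.models.doc2vec import Doc2Vec, TaggedDocument

# Toy corpus standing in for the real documents.
corpus = [["the", "movie", "was", "great"],
          ["the", "film", "was", "terrible"]]
docs = [TaggedDocument(words=toks, tags=[i]) for i, toks in enumerate(corpus)]

model = Doc2Vec(docs, vector_size=50, window=2, min_count=1, epochs=20)
print(model.docvecs[0])  # gensim 3.x API; these vectors move visibly during training
```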
Hi @joonable, they do change very slowly, I agree. You can try a few things; for example:
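One knob worth experimenting with is the optimizer's learning rate, sketched below assuming the TF 1.x `GradientDescentOptimizer` used in the script (the value `1.0` is an illustrative assumption, not a tested recommendation):

```python
import tensorflow as tf

# Hypothetical tweak: a larger learning rate so the doc embeddings move faster.
# 1.0 is only an illustrative value to experiment with; `loss` is the model's
# existing loss tensor.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1.0)
train_step = optimizer.minimize(loss)
```

More epochs over shuffled batches, or giving the doc embeddings their own (larger) learning rate, are variations on the same idea.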
I hope that helps!
Hello. I'm trying to use the doc2vec algorithm in 07_Natural_Language_Processing/07_Sentiment_Analysis_With_Doc2Vec/07_sentiment_with_doc2vec.py.
My understanding is that the first training stage learns the word and doc embeddings, and the second stage is for text classification (sentiment analysis). Because I needed distributed representations of words and docs, not a classifier, I only ran the first stage.
After the training, I evaluated the vectors in the word and document embeddings using tf.train.Saver, and found that the doc embedding didn't change while the word embedding did. The doc embedding just stayed at its initial value.
Did I misunderstand the code or the doc2vec algorithm, or is there some kind of bug in the code? Thank you in advance for your answer.
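Concretely, the before/after check looked roughly like this (a sketch; `doc_embeddings` is the variable from the script, while `embeddings` is my assumed name for the word-embedding variable, with `sess`, `train_step`, and `feed_dict` set up as in the script):

```python
import numpy as np

# Snapshot both embedding matrices, train for a while, then compare.
word_before = sess.run(embeddings)      # word embedding matrix
doc_before = sess.run(doc_embeddings)   # document embedding matrix

for _ in range(10000):
    sess.run(train_step, feed_dict=feed_dict)

word_after = sess.run(embeddings)
doc_after = sess.run(doc_embeddings)

print(np.abs(word_before - word_after).max())  # clearly non-zero
print(np.abs(doc_before - doc_after).max())    # looked like zero to me
```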