For inference, it probably doesn't have any impact, as the vocabulary won't change. It may reduce memory usage during training, since there is no gradient associated with the word embedding.
The main point of using pre-computed embeddings is that it should speed up training and allow training models on smaller datasets.
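As a minimal sketch of what "no gradient associated with the word embedding" means in practice (the file name and shapes below are illustrative assumptions, not DeepQA's actual values):

```python
import numpy as np
import tensorflow as tf  # TF 1.x, as used by DeepQA

# Hypothetical pre-computed matrix of shape (vocab_size, embedding_size),
# e.g. exported from word2vec or GloVe.
pretrained_vectors = np.load("embeddings.npy")

# trainable=False keeps the optimizer from computing and applying gradients
# for the embedding, which is where the training-time memory saving comes from.
embedding = tf.get_variable(
    "embedding",
    initializer=tf.constant(pretrained_vectors, dtype=tf.float32),
    trainable=False)

word_ids = tf.placeholder(tf.int32, shape=[None])        # batch of token ids
word_vectors = tf.nn.embedding_lookup(embedding, word_ids)
```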
So in the embedding_rnn_seq2seq function, are all vocabulary words assigned to pre-generated word vectors?
No. The pre-generated word vectors are assigned in the loadEmbedding function, in https://github.com/Conchylicultor/DeepQA/blob/master/chatbot/chatbot.py
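The idea behind that function is to overwrite an embedding variable that already exists in the graph with the pre-generated vectors. A hedged sketch of that idea (not DeepQA's exact code; the argument names are assumptions):

```python
import tensorflow as tf  # TF 1.x

def load_embedding(sess, embedding_var, pretrained_vectors):
    """Overwrite an already-built embedding variable with pre-generated vectors.

    `embedding_var` is the tf.Variable created during graph construction and
    `pretrained_vectors` is a NumPy array with the same shape.
    """
    placeholder = tf.placeholder(tf.float32, shape=pretrained_vectors.shape)
    assign_op = embedding_var.assign(placeholder)
    sess.run(assign_op, feed_dict={placeholder: pretrained_vectors})
```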
In the embedding_rnn_seq2seq function we can change the word embedding vector size. The TensorFlow API documentation for this function states that the model will first embed the words into vectors. Which model are they using for that? Something like skip-gram or GloVe?
Or does it just initialize all the words with N-dimensional vectors first and train them with backprop during training?
It's your second answer. There is no unsupervised pre-training for the word embedding. The embeddings are trained jointly with the rest of the network.
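A minimal sketch of that default behaviour, with illustrative sizes (not DeepQA's):

```python
import tensorflow as tf  # TF 1.x

vocab_size, embedding_size = 10000, 64  # illustrative values

# Default case: a trainable variable with a random initializer.
embedding = tf.get_variable(
    "embedding", shape=[vocab_size, embedding_size],
    initializer=tf.random_uniform_initializer(-0.1, 0.1))

word_ids = tf.placeholder(tf.int32, shape=[None])
word_vectors = tf.nn.embedding_lookup(embedding, word_ids)

# Any loss built on top of `word_vectors` back-propagates into `embedding`,
# so the word vectors are learned jointly with the rest of the network.
```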
tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq can embed each word into vectors of the desired dimension while the graph is running in a session. I still don't have a clear idea of whether we can save memory by using pre-embedded word vectors.
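For reference, a hedged sketch of how that function is typically wired up; the placeholder names and sizes are illustrative, not DeepQA's values:

```python
import tensorflow as tf  # TF 1.x

vocab_size, embedding_size, hidden_size, seq_len = 10000, 64, 256, 10

encoder_inputs = [tf.placeholder(tf.int32, shape=[None]) for _ in range(seq_len)]
decoder_inputs = [tf.placeholder(tf.int32, shape=[None]) for _ in range(seq_len)]
cell = tf.contrib.rnn.BasicLSTMCell(hidden_size)

# The function creates the embedding variable internally and looks up each
# input symbol, so the word -> vector mapping happens inside the graph.
outputs, state = tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols=vocab_size,
    num_decoder_symbols=vocab_size,
    embedding_size=embedding_size)
```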