Conchylicultor / DeepQA

My TensorFlow implementation of "A neural conversational model", a deep-learning-based chatbot
Apache License 2.0

Can I save GPU memory by using pre-embedded word vectors? #91

Closed shamanez closed 7 years ago

shamanez commented 7 years ago

tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq can embed each word into a vector of the desired dimension while the graph is running in a session. I don't have a clear idea of whether we can save memory by using pre-embedded word vectors.
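
For reference, a minimal sketch of how that op is typically called in TF 1.x (the sizes, cell choice, and placeholder setup below are illustrative assumptions, not the project's actual settings). The op builds its own embedding matrices internally and looks word ids up while the session runs:

```python
import tensorflow as tf

# Illustrative sizes, not the project's defaults.
vocab_size = 10000
embedding_size = 64
num_steps = 10
batch_size = 32

cell = tf.contrib.rnn.BasicLSTMCell(256)

# Inputs are lists of word-id tensors; the op creates the embedding
# matrices internally and performs the lookup inside the graph.
encoder_inputs = [tf.placeholder(tf.int32, [batch_size]) for _ in range(num_steps)]
decoder_inputs = [tf.placeholder(tf.int32, [batch_size]) for _ in range(num_steps)]

outputs, state = tf.contrib.legacy_seq2seq.embedding_rnn_seq2seq(
    encoder_inputs, decoder_inputs, cell,
    num_encoder_symbols=vocab_size,
    num_decoder_symbols=vocab_size,
    embedding_size=embedding_size)
```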

Conchylicultor commented 7 years ago

For inference, it probably doesn't have any impact, as the vocabulary won't change. It may reduce memory usage during training, since there is no gradient associated with the word embedding.

The main point of using pre-computed embeddings is that they should speed up training and allow training models on smaller datasets.
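
As a sketch of that idea (the array contents and variable names here are placeholders, not the repo's code): loading pre-computed vectors into a frozen variable means no gradient is computed or stored for the embedding matrix, which is where the potential memory saving comes from.

```python
import numpy as np
import tensorflow as tf

# Suppose `pretrained` holds pre-computed vectors, e.g. loaded from word2vec.
pretrained = np.random.rand(10000, 64).astype(np.float32)  # placeholder data

# trainable=False freezes the matrix: no gradient is kept for it during training.
embedding = tf.get_variable("embedding", initializer=pretrained, trainable=False)

word_ids = tf.placeholder(tf.int32, [None, None])
word_vectors = tf.nn.embedding_lookup(embedding, word_ids)
```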

shamanez commented 7 years ago

So in the embedding_rnn_seq2seq function, are all vocabulary words assigned to pre-generated word vectors?

Conchylicultor commented 7 years ago

No. The pre-generated word vectors are assigned in the loadEmbedding function, in https://github.com/Conchylicultor/DeepQA/blob/master/chatbot/chatbot.py
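
Roughly, such a routine locates the embedding variable and overwrites its randomly initialized values with pre-trained vectors. The sketch below assumes a plain-text "word v1 v2 ..." vector file and a `word_to_id` mapping; the real loadEmbedding in chatbot.py may load and assign the vectors differently.

```python
import numpy as np
import tensorflow as tf

def load_embedding(sess, word_to_id, embedding_var, vectors_path, embedding_size):
    """Overwrite an embedding variable with pre-trained vectors (illustrative)."""
    # Words missing from the vector file keep a small random initialization.
    init = np.random.uniform(-0.25, 0.25,
                             (len(word_to_id), embedding_size)).astype(np.float32)
    with open(vectors_path) as f:          # one "word v1 v2 ..." entry per line
        for line in f:
            parts = line.rstrip().split(' ')
            word, values = parts[0], parts[1:]
            if word in word_to_id:
                init[word_to_id[word]] = np.asarray(values, dtype=np.float32)
    sess.run(embedding_var.assign(init))   # replace the random initialization
```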

shamanez commented 7 years ago

In the embedding_rnn_seq2seq function we can change the word embedding vector size. The TensorFlow API documentation for this function states that the model will first embed the words into vectors. What model is it using for that? Something like skip-gram or GloVe?

Or does it just initialize all the words with N-dimensional vectors and train them with backprop during training?

Conchylicultor commented 7 years ago

It's your second guess. There is no unsupervised pre-training for the word embeddings. They are trained jointly with the rest of the network.
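
In other words, the embedding is just a randomly initialized variable that receives gradients like any other weight. A minimal sketch (names and sizes are illustrative, not the repo's code):

```python
import tensorflow as tf

vocab_size, embedding_size = 10000, 64  # illustrative sizes

# Random initialization: no skip-gram / GloVe pre-training involved.
embedding = tf.get_variable("embedding", [vocab_size, embedding_size])

word_ids = tf.placeholder(tf.int32, [None])
inputs = tf.nn.embedding_lookup(embedding, word_ids)

# `inputs` feeds the seq2seq network; the loss gradient flows back into
# `embedding`, so the word vectors are learned jointly with the rest of the model.
```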