caicloud / tensorflow-tutorial

Example TensorFlow codes and Caicloud TensorFlow as a Service dev environment.

Questions about the Chapter 8 RNN code #63

Closed anty-zhang closed 6 years ago

anty-zhang commented 7 years ago
  1. Question 1

embedding = tf.get_variable("embedding", [VOCAB_SIZE, HIDDEN_SIZE])

Convert the original batch_size * num_steps word IDs into word vectors:

    # After conversion the input layer has shape batch_size * num_steps * HIDDEN_SIZE
    inputs = tf.nn.embedding_lookup(embedding, self.input_data)

Looking at these two lines, embedding is only declared as a VOCAB_SIZE * HIDDEN_SIZE matrix, with no explicit initialization, yet embedding_lookup already turns the input word IDs into word vectors. How does this actually work?
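For context, the lookup itself is just a row gather on the embedding matrix. A minimal NumPy sketch (not the book's TensorFlow code; small illustrative sizes):

```python
import numpy as np

# Toy embedding matrix: VOCAB_SIZE rows, one HIDDEN_SIZE-dim vector per word ID.
VOCAB_SIZE, HIDDEN_SIZE = 5, 3
embedding = np.arange(VOCAB_SIZE * HIDDEN_SIZE,
                      dtype=np.float32).reshape(VOCAB_SIZE, HIDDEN_SIZE)

# A batch of word IDs, shape batch_size * num_steps = 2 * 2.
input_data = np.array([[0, 2], [4, 1]])

# embedding_lookup is equivalent to fancy-indexing the rows by ID:
# the result has shape batch_size * num_steps * HIDDEN_SIZE.
inputs = embedding[input_data]

print(inputs.shape)   # (2, 2, 3)
print(inputs[0, 1])   # row 2 of embedding: [6. 7. 8.]
```

Whatever values the rows of `embedding` happen to hold (random at first, trained later), the lookup simply selects them; the initialization question is separate.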

  2. Question 2

def main(_):

# Load the raw data

train_data, valid_data, test_data, _ = reader.ptb_raw_data("/Users/xxx/work5/tensorflow/data/ptb_dataset/simple-examples/data")
print len(train_data)

# Define the initialization function
initializer = tf.random_uniform_initializer(-0.05, 0.05)
# Define the neural network model used for training
with tf.variable_scope("language_model", reuse=None, initializer=initializer):
    train_model = PTBModel(True, TRAIN_BATCH_SIZE, TRAIN_NUM_STEP)

# Define the neural network model used for evaluation
with tf.variable_scope("language_model", reuse=True, initializer=initializer):
    eval_model = PTBModel(False, EVAL_BATCH_SIZE, EVAL_NUM_STEP)

with tf.Session() as sess:
    tf.initialize_all_variables().run()

    # Train the model with the training data
    for i in range(NUM_EPOCH):
        print "In iteration: %d" % (i + 1)
        # Run one training epoch on the training data
        run_epoch(sess, train_model, train_data, train_model.train_op, True)

        # Evaluate the model on the validation data
        valid_perplexity = run_epoch(sess, eval_model, valid_data, tf.no_op(), False)
        print "Epoch: %d validation perplexity: %.3f" % (i + 1, valid_perplexity)

    # Finally, evaluate the model on the test data
    test_perplexity = run_epoch(sess, eval_model, test_data, tf.no_op(), False)

    print "Test perplexity: %.3f" % test_perplexity

And in this code, when evaluating on the test data with test_perplexity = run_epoch(sess, eval_model, test_data, tf.no_op(), False), why is eval_model (the model used on the validation set) used instead of the trained train_model?

The code above mainly follows pages 217-221 of your book. I would appreciate it if you could clear this up in your spare time. Many thanks!

HammondWen commented 7 years ago

Same question.

perhapszzy commented 7 years ago

For question 1:

embedding = tf.get_variable("embedding", [VOCAB_SIZE, HIDDEN_SIZE])

That variable is defined inside the variable_scope below:

initializer = tf.random_uniform_initializer(-0.05, 0.05)
# Define the neural network model used for training
with tf.variable_scope("language_model", reuse=None, initializer=initializer):
    train_model = PTBModel(True, TRAIN_BATCH_SIZE, TRAIN_NUM_STEP)

So the default initializer for every variable in the scope, including embedding, is tf.random_uniform_initializer(-0.05, 0.05); the variable gets its random values when the session initializes the variables.
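A plain-NumPy sketch of the effect (illustrative only; in the real code TensorFlow fills the variable when tf.initialize_all_variables() runs, and the VOCAB_SIZE/HIDDEN_SIZE values below are assumed from the book's PTB model):

```python
import numpy as np

# Every variable created in the scope without its own initializer --
# including "embedding" -- starts with values drawn uniformly from
# [-0.05, 0.05), exactly what random_uniform_initializer(-0.05, 0.05) does.
VOCAB_SIZE, HIDDEN_SIZE = 10000, 200   # assumed PTB model sizes
rng = np.random.RandomState(0)
embedding = rng.uniform(-0.05, 0.05, size=(VOCAB_SIZE, HIDDEN_SIZE))

print(embedding.min() >= -0.05, embedding.max() < 0.05)  # True True
```

After initialization the rows are just small random vectors; training then updates them by backpropagation like any other weight matrix.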

perhapszzy commented 7 years ago

For question 2: because eval_model is built in the same variable_scope("language_model") with reuse=True, it shares all the trained parameters with train_model, so nothing is lost by not calling train_model directly. The main change in the eval model is that batch_size and num_steps are both 1, which lets it handle test data of arbitrary length.
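A plain-NumPy sketch of that last point (hypothetical weights and a bare tanh cell standing in for the book's LSTM): with num_steps=1 the model consumes one word at a time and carries the state forward, so the sequence length never has to match the training-time num_steps.

```python
import numpy as np

HIDDEN_SIZE = 4
rng = np.random.RandomState(42)
# Hypothetical recurrent and input weights, illustrative only.
W_h = rng.uniform(-0.05, 0.05, (HIDDEN_SIZE, HIDDEN_SIZE))
W_x = rng.uniform(-0.05, 0.05, (HIDDEN_SIZE, HIDDEN_SIZE))

def rnn_step(state, x):
    # One step of a simple tanh RNN cell.
    return np.tanh(state @ W_h + x @ W_x)

state = np.zeros((1, HIDDEN_SIZE))       # batch_size = 1
for x in rng.randn(7, 1, HIDDEN_SIZE):   # any sequence length works; here 7
    state = rnn_step(state, x)           # state rolls forward step by step

print(state.shape)  # (1, 4)
```

The same loop runs unchanged for a 7-word or a 7000-word sequence, which is why the eval model with batch_size=1 and num_steps=1 can score test data of any length.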

caicloud-bot commented 6 years ago

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

/lifecycle stale

caicloud-bot commented 6 years ago

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

/lifecycle rotten /remove-lifecycle stale

caicloud-bot commented 6 years ago

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

/close