outputs = []
state = self.initial_state
with tf.variable_scope("RNN"):
    for time_step in range(num_steps):
        if time_step > 0: tf.get_variable_scope().reuse_variables()
        cell_output, state = cell(inputs[:, time_step, :], state)
        outputs.append(cell_output)
if time_step > 0: tf.get_variable_scope().reuse_variables()
What exactly does this line reuse, and is it really necessary to write it? I printed all the trainable variables in my code; the output is as follows (True/False is the value of is_training):

trainable name: language_model/embedding:0 True
trainable name: language_model/RNN/multi_rnn_cell/cell_0/basic_lstm_cell/kernel:0 True
trainable name: language_model/RNN/multi_rnn_cell/cell_0/basic_lstm_cell/bias:0 True
trainable name: language_model/weight:0 True
trainable name: language_model/bias:0 True
trainable name: language_model/embedding:0 False
trainable name: language_model/RNN/multi_rnn_cell/cell_0/basic_lstm_cell/kernel:0 False
trainable name: language_model/RNN/multi_rnn_cell/cell_0/basic_lstm_cell/bias:0 False
trainable name: language_model/weight:0 False
trainable name: language_model/bias:0 False
As the output shows, all variables live under variable_scope("language_model") (defined in the main function), and that scope is opened with reuse=True when is_training=False. So Train, Valid, and Test all use the same set of variables. Since variable_scope("RNN") is nested inside variable_scope("language_model"), it naturally inherits reuse=True as well, so why bother writing if time_step > 0: tf.get_variable_scope().reuse_variables() at all?
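Not part of the original question, just an illustration of the mechanism being asked about: the behavior of tf.get_variable under a scope's reuse flag can be mimicked with a minimal pure-Python sketch (the ScopeStore class and its method names here are hypothetical stand-ins, not TensorFlow API):

```python
# Minimal pure-Python sketch (NOT TensorFlow) of get_variable-style reuse:
# the first call in a scope creates a variable; later calls may only
# return the existing one once reuse has been switched on. Calling again
# without reuse, or reusing a name that does not exist, is an error.
class ScopeStore:
    def __init__(self):
        self._vars = {}      # name -> value
        self._reuse = False  # mirrors tf.get_variable_scope().reuse

    def reuse_variables(self):
        self._reuse = True   # one-way switch, like TF1's reuse flag

    def get_variable(self, name, initial=0.0):
        if name in self._vars:
            if not self._reuse:
                raise ValueError("Variable %s already exists" % name)
            return self._vars[name]        # reuse: hand back the same value
        if self._reuse:
            raise ValueError("Variable %s does not exist" % name)
        self._vars[name] = initial         # first use: create it
        return self._vars[name]

scope = ScopeStore()
w0 = scope.get_variable("kernel", 1.0)  # time_step 0: creates the variable
scope.reuse_variables()                 # like the line in question
w1 = scope.get_variable("kernel")       # time_step 1: returns the same one
assert w0 == w1
```

The point of the sketch: creation and reuse are mutually exclusive modes, which is why the loop flips the flag only after time_step 0 has created the variables.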
Note: only one line of my code differs from the book's; I changed it after finding the official example program. The change is:
cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell for _ in range(NUM_EPOCH)])
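An aside on the list comprehension in that line (my observation, not from the original question): in plain Python, [obj for _ in range(n)] yields n references to one and the same object, whereas calling a constructor or factory inside the comprehension yields n distinct objects. The Cell class below is a hypothetical stand-in for an LSTM cell:

```python
# Shows the identity semantics of the two list-building styles.
class Cell:
    pass

cell = Cell()
shared = [cell for _ in range(3)]      # three references to ONE object
distinct = [Cell() for _ in range(3)]  # three separate objects

assert shared[0] is shared[1] is shared[2]
assert distinct[0] is not distinct[1]
```

This identity question is exactly what makes the two MultiRNNCell constructions potentially behave differently.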
Incidentally, could you also explain the difference between the book's version and the one above? Thank you in advance; I hope you can get an answer to this!

tensorflow rnn_example ptb_word_lm.py