Or should I just keep the code as you had it in trainer.py?
Related issues: issue1. In the meantime, I modified the batch_norm function so that I can handle the moving mean/variance manually during inference:
```python
# Thanks to https://github.com/OlavHN/bnlstm
import tensorflow as tf


def batch_norm(inputs, name_scope, is_training, epsilon=1e-3, decay=0.99):
    with tf.variable_scope(name_scope):
        size = inputs.get_shape().as_list()[1]
        # Learnable scale (gamma) and offset (beta); gamma is initialized
        # to 0.1 following the recurrent batch norm paper, beta to zero.
        scale = tf.get_variable(
            'scale', [size], initializer=tf.constant_initializer(0.1))
        offset = tf.get_variable(
            'offset', [size], initializer=tf.zeros_initializer())
        # Moving averages of the population statistics, used at inference.
        population_mean = tf.get_variable(
            'population_mean', [size],
            initializer=tf.zeros_initializer(), trainable=False)
        population_var = tf.get_variable(
            'population_var', [size],
            initializer=tf.ones_initializer(), trainable=False)
        # Statistics of the current mini-batch.
        batch_mean, batch_var = tf.nn.moments(inputs, [0])

        # The following part is based on the implementation of:
        # https://github.com/cooijmanstim/recurrent-batch-normalization
        train_mean_op = tf.assign(
            population_mean,
            population_mean * decay + batch_mean * (1 - decay))
        train_var_op = tf.assign(
            population_var,
            population_var * decay + batch_var * (1 - decay))

        # NOTE: is_training must be a Python bool here, not a tensor; this
        # branch is resolved once, at graph-construction time.
        if is_training:
            with tf.control_dependencies([train_mean_op, train_var_op]):
                return tf.nn.batch_normalization(
                    inputs, batch_mean, batch_var, offset, scale, epsilon)
        else:
            return tf.nn.batch_normalization(
                inputs, population_mean, population_var, offset, scale,
                epsilon)
```
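For context, a call site would look something like this (`hidden` and the scope name are purely illustrative):

```python
# Hypothetical call: `hidden` is a [batch, size] activation tensor.
normalized = batch_norm(hidden, 'bn_hidden', is_training=True)
```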
However, it did not work properly: the model converges too early.
@vrenkens: do you have an idea how I can fix this?
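One possible cause of the early plateau, sketched here without having verified it in this repo: if is_training arrives as a tensor (e.g. a tf.placeholder(tf.bool)) rather than a Python bool, the `if` above is resolved once at graph-construction time; with the original `is_training is True` comparison it silently falls into the inference branch, so the population statistics are never updated. A tf.cond-based variant defers the choice to run time (batch_norm_cond is a hypothetical name; it assumes is_training is a boolean tensor):

```python
def batch_norm_cond(inputs, name_scope, is_training, epsilon=1e-3,
                    decay=0.99):
    with tf.variable_scope(name_scope):
        size = inputs.get_shape().as_list()[1]
        scale = tf.get_variable(
            'scale', [size], initializer=tf.constant_initializer(0.1))
        offset = tf.get_variable(
            'offset', [size], initializer=tf.zeros_initializer())
        population_mean = tf.get_variable(
            'population_mean', [size],
            initializer=tf.zeros_initializer(), trainable=False)
        population_var = tf.get_variable(
            'population_var', [size],
            initializer=tf.ones_initializer(), trainable=False)
        batch_mean, batch_var = tf.nn.moments(inputs, [0])

        def train_branch():
            # Update the moving averages, then normalize with batch stats.
            update_mean = tf.assign(
                population_mean,
                population_mean * decay + batch_mean * (1 - decay))
            update_var = tf.assign(
                population_var,
                population_var * decay + batch_var * (1 - decay))
            with tf.control_dependencies([update_mean, update_var]):
                return tf.nn.batch_normalization(
                    inputs, batch_mean, batch_var, offset, scale, epsilon)

        def inference_branch():
            # Normalize with the tracked population statistics.
            return tf.nn.batch_normalization(
                inputs, population_mean, population_var, offset, scale,
                epsilon)

        # The branch is now chosen per session.run, not at graph build time.
        return tf.cond(is_training, train_branch, inference_branch)
```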
Based on https://arxiv.org/pdf/1603.09025.pdf, I still have a problem handling update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS) in trainer.py, and I would be very happy if anyone could help me fix this.
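For reference, the pattern the TensorFlow documentation gives for UPDATE_OPS is roughly the following (a minimal sketch; `loss` and the Adam optimizer stand in for whatever trainer.py actually builds):

```python
# Run the ops registered in UPDATE_OPS (e.g. moving-average updates)
# before each training step; `loss` is assumed to be defined elsewhere.
update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
with tf.control_dependencies(update_ops):
    train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)
```

Note that the hand-written batch_norm above performs its updates through explicit control dependencies, so nothing lands in UPDATE_OPS unless the assign ops are registered with tf.add_to_collection(tf.GraphKeys.UPDATE_OPS, ...); the collection is only populated automatically by layers such as tf.layers.batch_normalization.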
Following the TensorFlow documentation, I tried this inside the update function in trainer.py, but it still throws an error. The error message is: