Closed fyw1999 closed 2 years ago
Hi! I'm sorry, but I'm not quite sure what I should respond here. Is the above an issue that affects model training or evaluation?
Thanks so much for the response! I am very interested in your work. However, when I debugged the code, I found that the gradients of the variables from conv3-conv5 returned by `gvs = self.optimizer.compute_gradients(loss)` are None. Then, when those None gradients are doubled, an error is raised and the program stops.

```python
# Define the loss
loss = layers['total_loss']
lr = tf.Variable(cfg.TRAIN.LEARNING_RATE, trainable=False)
self.optimizer = tf.train.MomentumOptimizer(lr, cfg.TRAIN.MOMENTUM)
# Compute the gradients with regard to the loss
gvs = self.optimizer.compute_gradients(loss)
train_op = self.optimizer.apply_gradients(gvs)
# Double the gradient of the bias if set
if cfg.TRAIN.DOUBLE_BIAS:
  final_gvs = []
  with tf.variable_scope('Gradient_Mult') as scope:
    for grad, var in gvs:
      scale = 1.
      if cfg.TRAIN.DOUBLE_BIAS and '/biases:' in var.name:
        scale *= 2.
      if not np.allclose(scale, 1.0):
        grad = tf.multiply(grad, scale)
      final_gvs.append((grad, var))
  train_op = self.optimizer.apply_gradients(final_gvs)
else:
  train_op = self.optimizer.apply_gradients(gvs)
# Initialize post-hist module of drl-RPN
if cfg.DRL_RPN.USE_POST:
  loss_post = layers['total_loss_hist']
  lr_post = tf.Variable(cfg.DRL_RPN_TRAIN.POST_LR, trainable=False)
  self.optimizer_post = tf.train.MomentumOptimizer(lr, cfg.TRAIN.MOMENTUM)
  gvs_post = self.optimizer_post.compute_gradients(loss_post)
  train_op_post = self.optimizer_post.apply_gradients(gvs_post)
else:
  lr_post = None
  train_op_post = None
# Initialize main drl-RPN network
self.net.build_drl_rpn_network()
return lr, train_op, lr_post, train_op_post
```
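The crash can be avoided by skipping `(grad, var)` pairs whose gradient is None before scaling. Below is a minimal, framework-free sketch of that guard (not code from the repository); it represents gradient/variable pairs as plain `(grad, var_name)` tuples mimicking what TF1's `Optimizer.compute_gradients()` returns, where `grad` is None for variables that do not contribute to the loss:

```python
def scale_gradients(gvs, double_bias=True):
    """Double bias gradients, skipping pairs whose gradient is None.

    gvs: list of (grad, var_name) tuples, mimicking the output of
    TF1's Optimizer.compute_gradients(). grad is None for variables
    that are disconnected from the loss (e.g. a frozen backbone).
    """
    final_gvs = []
    for grad, var_name in gvs:
        if grad is None:
            # tf.multiply(None, scale) raises; drop the pair instead.
            # apply_gradients() simply skips variables not in the list.
            continue
        scale = 2.0 if double_bias and '/biases:' in var_name else 1.0
        final_gvs.append((grad * scale, var_name))
    return final_gvs


# Example: a backbone variable with no gradient is dropped,
# a bias gradient is doubled, a weight gradient is untouched.
gvs = [(None, 'vgg_16/conv3/weights:0'),
       (0.5, 'head/biases:0'),
       (1.0, 'head/weights:0')]
print(scale_gradients(gvs))
```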
Ah, well, that has been discussed in another (closed) issue thread; have a look at the closed issues. The conclusion is that I have not implemented code for training the backbone. It is kept frozen (allowing it to be tuned as well did not improve results). So the current code allows for the RL training plus fine-tuning of the detector heads.
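One common way to keep a backbone frozen in TF1 is to restrict the `var_list` passed to `compute_gradients()` to the non-backbone variables. The sketch below is a framework-free illustration of that filtering, not code from this repository; the scope names (`vgg_16/conv…`) are assumptions for the example:

```python
def trainable_subset(var_names,
                     frozen_scopes=('vgg_16/conv1', 'vgg_16/conv2',
                                    'vgg_16/conv3', 'vgg_16/conv4',
                                    'vgg_16/conv5')):
    """Return variable names outside the frozen backbone scopes.

    In TF1 this list would be passed as the var_list argument of
    Optimizer.compute_gradients(), so the optimizer never touches
    the frozen layers at all (and never sees None gradients for them).
    """
    return [v for v in var_names
            if not any(v.startswith(s) for s in frozen_scopes)]


names = ['vgg_16/conv3/conv3_1/weights:0',
         'vgg_16/conv5/conv5_3/biases:0',
         'drl_rpn/fc/weights:0']
print(trainable_subset(names))
```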
It seems to be this issue: https://github.com/aleksispi/drl-rpn-tf/issues/5
In the file train_val.py, the code `gvs = self.optimizer.compute_gradients(loss)` returns a gradient of None for conv3-conv5 in VGG. However, `trainable=is_training` equals `trainable=True` in the function `_image_to_head`.
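Note that `trainable=True` only means a variable *may* receive gradients; if it is not on the backprop path to the loss (e.g. the graph is cut between the backbone and the loss), `compute_gradients()` still returns None for it. A small framework-free diagnostic helper (hypothetical, not part of the repository) for spotting such variables in a `(grad, var_name)` list:

```python
def vars_without_gradients(gvs):
    """List the variable names whose gradient is None.

    gvs mimics the (grad, var_name) pairs that TF1's
    Optimizer.compute_gradients() returns. A None gradient means the
    variable is disconnected from the loss, regardless of whether it
    was created with trainable=True.
    """
    return [name for grad, name in gvs if grad is None]


gvs = [(None, 'vgg_16/conv4/weights:0'), (0.3, 'drl_rpn/fc/weights:0')]
print(vars_without_gradients(gvs))
```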