XuanheLiu opened this issue 7 years ago
@XuanheLiu Did you solve the problem? I ran into the same one.
@wenbowen123 The problem has been solved. Use a small learning rate at first; once the loss value has come down, increase it, and then lower it again later.
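That kind of warmup-style schedule can be written in TensorFlow 1.x roughly as below; the step boundaries and rates are illustrative assumptions, not values from this repo:

import tensorflow as tf

global_step = tf.train.get_or_create_global_step()
# Start small, increase once the loss has come down, then decay again.
learning_rate = tf.train.piecewise_constant(
    global_step,
    boundaries=[1000, 10000],
    values=[1e-4, 1e-3, 1e-4])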
@XuanheLiu Thank you! What does the trained result look like? How is the accuracy?
@wenbowen123 I don't really know how to train it properly. The weights from my training are not as good as the author's. I remember the loss never dropped very low, but I don't remember the exact value.
@wenbowen123 So how did you solve the error? I ran into the same problem. Thanks a lot.
The model diverges if the training process changes the weights too much and the loss grows larger, or even extremely large. Try reducing the standard deviation used to initialize the weight variables and the constant value used for the bias variables, so that the initial weights and biases are relatively small. You can also consider changing the learning rate. As far as I know, model divergence is caused by those (hyper)parameters. Cheers
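For example, a minimal sketch in TensorFlow 1.x of what that could look like; the variable shapes and values here are illustrative assumptions, not taken from this repo:

import tensorflow as tf

# A small standard deviation for the initial weights and a small constant
# bias keep the first forward passes, and therefore the first loss values,
# in a numerically safe range.
weights = tf.get_variable(
    'weights', shape=[3, 3, 16, 32],
    initializer=tf.truncated_normal_initializer(stddev=0.01))
biases = tf.get_variable(
    'biases', shape=[32],
    initializer=tf.constant_initializer(0.0))

# A smaller learning rate is the other common fix for early divergence.
optimizer = tf.train.GradientDescentOptimizer(learning_rate=1e-4)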
assert not np.isnan(loss_value), 'Model diverged with loss = NaN'
AssertionError: Model diverged with loss = NaN
When I use Python 3 to run this project, the training loss becomes NaN, but when I use Python 2, the model converges. @XuanheLiu @Fju @nilboy
Does it make any sense that it works in Python 2 but not in Python 3?
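One possible, unconfirmed explanation: in Python 2, / between two integers is floor division, while in Python 3 it is true division, so any integer division buried in the loss computation or the input pipeline silently changes value between the two interpreters. A hypothetical illustration, not code from this repo:

# Python 2: 7 / 2 == 3   (floor division)
# Python 3: 7 / 2 == 3.5 (true division)
cell_size = 7
half = cell_size / 2  # differs between Python 2 and 3
# Writing cell_size // 2 or cell_size / 2.0 makes the intent explicit
# and behaves the same under both interpreters.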
You need to compute and apply the gradients separately, as in the following process:
import tensorflow as tf

opt = tf.train.AdamOptimizer(0.1)
# Compute gradients of the loss (not the logits) w.r.t. all variables.
gvs = opt.compute_gradients(loss)
# Clip each gradient into [-1, 1] before applying it.
capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var) for grad, var in gvs]
train_op = opt.apply_gradients(capped_gvs)
This limits the range of the computed gradients and prevents the model from diverging.
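One caveat worth adding, which is not in the comment above: compute_gradients returns (None, var) pairs for variables the loss does not depend on, and tf.clip_by_value fails on None, so it is safer to filter those out:

capped_gvs = [(tf.clip_by_value(grad, -1., 1.), var)
              for grad, var in gvs if grad is not None]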
Without loading yolo_tiny.ckpt, training yolo_net directly raises the error: AssertionError: Model diverged with loss = NaN.
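For reference, restoring the pretrained checkpoint before training looks roughly like this in TensorFlow 1.x; the checkpoint path below is an assumption, so substitute wherever yolo_tiny.ckpt lives in your setup:

import tensorflow as tf

# Build the yolo_net graph first, then create the saver.
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Starting from the pretrained weights gives the network a sane
    # initial loss and avoids the early divergence seen from scratch.
    saver.restore(sess, 'models/pretrain/yolo_tiny.ckpt')  # assumed path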