pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Learning rate becomes 0 #2007

Closed vivekdeepquanty closed 4 years ago

vivekdeepquanty commented 4 years ago

My lr was 0.0001, but after some epochs it became zero:

Epoch: [15] [  0/209] eta: 0:03:06 lr: 0.000000 loss: 0.5737 (0.5737) loss_classifier: 0.0601 (0.0601) loss_box_reg: 0.0831 (0.0831) loss_mask: 0.4023 (0.4023) loss_objectness: 0.0062 (0.0062) loss_rpn_box_reg: 0.0221 (0.0221) time: 0.8938 data: 0.2370 max mem: 6450
Epoch: [15] [ 10/209] eta: 0:02:13 lr: 0.000000 loss: 0.5818 (0.6080) loss_classifier: 0.0609 (0.0621) loss_box_reg: 0.0782 (0.0759) loss_mask: 0.4273 (0.4496) loss_objectness: 0.0061 (0.0073) loss_rpn_box_reg: 0.0119 (0.0132) time: 0.6731 data: 0.0303 max mem: 6450
Epoch: [15] [ 20/209] eta: 0:02:05 lr: 0.000000 loss: 0.5848 (0.5937) loss_classifier: 0.0595 (0.0620) loss_box_reg: 0.0693 (0.0756) loss_mask: 0.4273 (0.4355) loss_objectness: 0.0060 (0.0068) loss_rpn_box_reg: 0.0118 (0.0138) time: 0.6527 data: 0.0096 max mem: 6450
Epoch: [15] [ 30/209] eta: 0:01:59 lr: 0.000000 loss: 0.5848 (0.5950) loss_classifier: 0.0616 (0.0626) loss_box_reg: 0.0710 (0.0762) loss_mask: 0.4182 (0.4338) loss_objectness: 0.0065 (0.0087) loss_rpn_box_reg: 0.0106 (0.0137) time: 0.6611 data: 0.0098 max mem: 6450
Epoch: [15] [ 40/209] eta: 0:01:50 lr: 0.000000 loss: 0.5718 (0.5921) loss_classifier: 0.0639 (0.0642) loss_box_reg: 0.0767 (0.0768) loss_mask: 0.4173 (0.4295) loss_objectness: 0.0072 (0.0086) loss_rpn_box_reg: 0.0101 (0.0130) time: 0.6396 data: 0.0092 max mem: 6450
Epoch: [15] [ 50/209] eta: 0:01:43 lr: 0.000000 loss: 0.5703 (0.5907) loss_classifier: 0.0640 (0.0655) loss_box_reg: 0.0798 (0.0764) loss_mask: 0.4035 (0.4259) loss_objectness: 0.0062 (0.0098) loss_rpn_box_reg: 0.0109 (0.0131) time: 0.6363 data: 0.0088 max mem: 6450

I am training on custom data with 2 classes (1 class + background).

fmassa commented 4 years ago

This is probably an artifact of the logging, which only prints the learning rate to 6 decimal places. We do apply lr decay, and you probably set the decay factor too aggressively, so after enough epochs the lr is smaller than the logger can display: it shows as 0.000000 even though it is nonzero.
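For illustration, here is a minimal sketch of how an aggressive StepLR schedule can push the lr below the six decimal places the logger prints; the step_size and gamma values here are assumptions, not the reporter's actual settings:

```python
import torch

# A single dummy parameter so the optimizer has something to manage.
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.SGD([param], lr=1e-4)

# Assumed schedule: decay by 10x every 3 epochs. By epoch 15 the lr is
# 1e-4 * 0.1**5 = 1e-9, far below six printable decimal places.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.1)

for epoch in range(16):
    optimizer.step()   # the actual training step would happen here
    scheduler.step()

lr = optimizer.param_groups[0]["lr"]
print(f"lr: {lr:.6f}")  # 'lr: 0.000000' -- looks like zero in the logs
print(f"lr: {lr:.2e}")  # 'lr: 1.00e-09' -- tiny, but not actually zero
```

Printing optimizer.param_groups[0]['lr'] in scientific notation is a quick way to confirm whether the schedule is the culprit.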

Closing, as this doesn't seem like a bug to me.

vivekdeepquanty commented 4 years ago

You are right, but the problem is that the loss does not decrease even after 100 epochs. That's why I thought the learning rate had become zero.


fmassa commented 4 years ago

There might be many reasons why the loss is not decreasing, but without further information there is no way for us to help.
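One quick thing to rule out (a sketch with assumed hyperparameters, not a confirmed fix) is the schedule itself: use a gentler decay and log the lr in scientific notation, so a flat loss can no longer be blamed on an invisibly small learning rate.

```python
import torch

model = torch.nn.Linear(4, 2)  # stand-in for the actual detection model
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)

# Assumed gentler schedule: multiply the lr by 0.9 every 10 epochs, so after
# 100 epochs lr = 1e-4 * 0.9**10, roughly 3.5e-5: still a usable step size.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.9)

for epoch in range(100):
    # ... run the real training epoch here ...
    optimizer.step()
    scheduler.step()
    if epoch % 10 == 0:
        # Scientific notation never rounds a small lr down to 0.000000.
        print(f"epoch {epoch:3d}: lr = {optimizer.param_groups[0]['lr']:.3e}")
```

If the loss still plateaus with a healthy lr, the cause lies elsewhere (data, labels, model setup), which is the kind of further information this issue would need.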