Closed. mymuli closed this issue 2 years ago
The paper says that the learning rate is divided by 10 at the 24th epoch and again at the 30th epoch.

But in your code the conditions are epoch >= 24 and epoch >= 30, and the learning rate is multiplied by 0.1 under each of them. As I read it, every epoch from 24 onward multiplies the learning rate by 0.1 again, so it would become very low. I think this is a bug in the code, and the conditions should be changed to epoch == 24 and epoch == 30.

https://github.com/allenai/elastic/blob/57345c600c63fbde163c41929d6d6dd894d408ce/multilabel_classify.py#L387
https://github.com/allenai/elastic/blob/57345c600c63fbde163c41929d6d6dd894d408ce/multilabel_classify.py#L389
Sorry, I made a mistake. I didn't notice that lr = args.lr... The learning rate was 0.0001 between epochs 24 and 29, and 0.00001 between epochs 30 and 36...
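For anyone who lands here with the same question, here is a minimal sketch of why the >= conditions do not compound. It is not the repository's actual function: the name adjusted_lr and the base value 0.001 are assumptions, with the base inferred from the 0.0001 figure quoted above. Because lr starts again from the base value on every call, each threshold is applied at most once per call.

```python
def adjusted_lr(base_lr, epoch):
    """Hypothetical stand-in for the schedule discussed in this thread."""
    lr = base_lr          # restart from the base value on every call
    if epoch >= 24:
        lr *= 0.1         # first decay, applied once for any epoch >= 24
    if epoch >= 30:
        lr *= 0.1         # second decay, applied once for any epoch >= 30
    return lr


if __name__ == "__main__":
    base_lr = 0.001  # assumed base learning rate (args.lr in the thread)
    for epoch in (0, 23, 24, 29, 30, 36):
        print(epoch, adjusted_lr(base_lr, epoch))
```

The compounding described in the original post would only occur if the function read the optimizer's current learning rate and multiplied that value in place on every epoch, instead of starting from the base value.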