tstandley / taskgrouping

Code for "Which Tasks Should Be Learned Together in Multi-task Learning?"

Is there any risk of producing underfit models? #8

Open shun-zheng opened 2 years ago

shun-zheng commented 2 years ago

Dear authors,

Thanks for your great work!

I am trying to replicate your results, but the training procedure exits unexpectedly because of the following lines of code.

https://github.com/tstandley/taskgrouping/blob/dc6c89c269021597d222860406fa0fb81b02a231/train_taskonomy.py#L431-L437

Training stops if the learning rate becomes too low, whereas early stopping is typically triggered by a validation metric rather than by learning-rate decay.
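For context, the check at those lines behaves roughly like the sketch below (a paraphrase for illustration, not the exact repository code; the variable names are placeholders and the threshold corresponds to the args.minimum_learning_rate flag mentioned in the reply):

```python
import torch

# Paraphrased sketch of the abort check (illustrative, not the exact lines
# from train_taskonomy.py): training exits once the decayed learning rate
# drops below a configured floor, even before the planned epochs finish.
model = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
minimum_learning_rate = 1e-5  # plays the role of args.minimum_learning_rate

for epoch in range(300):
    # ... training and validation for one epoch, with LR decay, go here ...
    current_lr = optimizer.param_groups[0]['lr']
    if current_lr < minimum_learning_rate:
        print(f'Aborting at epoch {epoch}: lr {current_lr} is below the minimum.')
        break
```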

I am wondering whether there is a high risk of producing underfit models.

Thanks,

tstandley commented 2 years ago

This code was added so that training of broken or already-finished models can be aborted quickly.

During normal training, whenever the loss fails to improve, the learning rate is halved. But the model is only considered done when the proper number of epochs has elapsed. So the learning rate can get so low that the model isn't learning anything anymore.

Basically it's a way to ensure that we move on to the next model as soon as the learning has stalled.

If args.minimum_learning_rate is set too high, this can result in underfitting, but I think the default is pretty sane. You can reduce it to zero if you want to turn that off.
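To make the schedule concrete, here is a minimal sketch of the behavior described above, assuming a standard PyTorch halve-on-plateau scheduler (illustrative only, not the repository's exact implementation):

```python
import torch

# Halve the LR whenever the monitored loss plateaus, train for a fixed number
# of epochs, and abort early once the LR falls below the minimum.
model = torch.nn.Linear(8, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5)
minimum_learning_rate = 1e-5  # set to 0 to disable the early abort

for epoch in range(100):          # the "proper number of epochs"
    val_loss = 1.0                # placeholder; the real loop computes this
    scheduler.step(val_loss)      # no improvement -> LR eventually halved

    if optimizer.param_groups[0]['lr'] < minimum_learning_rate:
        # Learning has effectively stalled; move on to the next model.
        break
```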