Closed ganlumomo closed 3 years ago
Hi. Thanks for the interest. If you want a quick way of setting the weights, I would suggest tracking the task-specific losses over a few iterations/epochs for every task, then adjusting the task weights so that all of the losses are of similar magnitude. For example, suppose we solve two tasks, each with weight 1.0, and the average loss of task 1 is 100 times the average loss of task 2. Then I would train with a weight of 100 for the second task. Also, if you use a multi-task baseline, I find that Adam can sometimes achieve slightly higher numbers than SGD.
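The loss-tracking heuristic above can be sketched roughly as follows (a hypothetical helper, not code from this repo; the rule of scaling each task's weight by the reference task's average loss is my reading of the description):

```python
# Sketch of the balancing heuristic: track each task's average loss over a
# few warm-up iterations/epochs, then pick weights so the weighted losses
# end up at a similar magnitude.

def balance_weights(avg_losses, reference_task=0):
    """Return per-task weights that bring every task's average loss
    to the same magnitude as the reference task's average loss."""
    ref = avg_losses[reference_task]
    return [ref / loss for loss in avg_losses]

# Example matching the comment above: task 1's average loss is 100x
# task 2's, so task 2 gets a weight of 100 while task 1 keeps 1.0.
avg_losses = [50.0, 0.5]  # hypothetical averages tracked during warm-up
weights = balance_weights(avg_losses)
print(weights)  # [1.0, 100.0]
```

From there, further trial-and-error tuning around these initial values is still possible, as noted below.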
Of course, you might get even better results if you further tune the weights through trial and error. However, I find that the proposed method works well in practice.
Good luck.
Hi @SimonVandenhende,
Thank you so much for the suggestions. For the Adam optimizer, does the amsgrad parameter need to be set to True?
Best.
Hi @ganlumomo
I only explored the regular Adam optimizer and SGD (see the get_optimizer function in utils/common_config.py). It should be possible to get good results by combining these optimizers with properly initialized weights (following the procedure outlined above).
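For reference, a minimal optimizer factory in PyTorch could look like the sketch below. It only loosely mirrors what get_optimizer in utils/common_config.py might do; the function signature and the default hyperparameters here are assumptions, not the repo's actual values. Note that torch.optim.Adam leaves amsgrad at False by default, i.e. this is "regular" Adam:

```python
import torch

# Hypothetical optimizer factory (not the repo's implementation).
# Learning rate and momentum defaults below are placeholder assumptions.
def get_optimizer(name, params, lr=1e-4):
    if name == 'adam':
        # Regular Adam; amsgrad keeps its default value of False.
        return torch.optim.Adam(params, lr=lr)
    elif name == 'sgd':
        return torch.optim.SGD(params, lr=lr, momentum=0.9)
    else:
        raise ValueError(f'Unknown optimizer: {name}')

model = torch.nn.Linear(4, 2)
optimizer = get_optimizer('adam', model.parameters())
```

Either choice should work once the task weights are initialized with the loss-balancing procedure described earlier in the thread.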
Hi, I am very impressed by your survey on MTL, from which I have learned a lot. I am currently working on an MTL project, so I am very curious about the grid search experiments for the fixed weights. I have not found details about this in your paper or in this repo. Could you give me more information on this? What exactly were the grid-searched weights? Did you use all combinations of those weights to train the MTL network and evaluate it? If I want to find the best weights for my MTL network, do I need to run the same experiments? Could you give me some suggestions on this? Thank you so much!