Question on relative weights

Thanks for the great research and code sharing. After reading the paper and using it in my research, I got a question. There are two styles for the implementation of weighted loss. Case 1) L = w_a L_a + w_b L_b + w_c L_c Case 2) L = L_a + w_b L_b + w_c * L_c In case 2, the weight of a loss L_a is set to 1. In my humble opinion, I guess that w_b and w_c will be learned with relative log_vars values accordingly. In your paper or code, on the other hand, all weights, i.e., all log_vars are set to learnable as in Case 1. Is there any intention to prefer Case 1? Could it be a problem if I use the style of Case 2?

yaringal / multi-task-learning-example

Question on relative weights #8