drawbridge / keras-mmoe

A TensorFlow Keras implementation of "Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts" (KDD 2018)
MIT License

Some questions about the trade-off between the losses of the two towers #4

Closed andrew-zzz closed 4 years ago

andrew-zzz commented 4 years ago

Hi Emin Orhan, thank you for sharing this beautiful code. I have a question about the loss in synthetic_demo.py, where output_layer holds the two towers and the model is compiled with model.compile(loss={'y0': 'mean_squared_error', 'y1': 'mean_squared_error'}, optimizer=adam_optimizer, metrics=[metrics.mae]). Does the model fit the two towers separately? The basic approach in multi-task learning is a combined loss such as loss = 0.5 * loss1 + 0.5 * loss2. Or is this combination handled by Keras?

alvin319 commented 4 years ago

Hi Andrew, thanks for reaching out! The loss is handled by Keras, and the model fits both towers at the same time. The total loss is y0_loss + y1_loss, where both terms are MSE losses. You can also assign a coefficient to each term of the loss if you want; see the loss_weights parameter of the compile method in Keras.
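
To make the arithmetic concrete, here is a minimal plain-Python sketch of how the per-tower losses combine into one objective. It is not taken from the repository or from Keras itself: the helper names mse and total_loss and the toy numbers are assumptions made up for illustration. It only mirrors the behaviour described above, where each output's loss has weight 1 unless a coefficient is supplied.

```python
def mse(y_true, y_pred):
    """Mean squared error over two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def total_loss(losses, loss_weights=None):
    """Weighted sum of per-output losses (hypothetical helper).

    Outputs missing from loss_weights get weight 1.0, mirroring the
    default behaviour when no loss_weights are passed to compile.
    """
    loss_weights = loss_weights or {}
    return sum(loss_weights.get(name, 1.0) * value
               for name, value in losses.items())


# Toy targets and predictions for the two towers (made-up numbers).
losses = {
    "y0": mse([1.0, 2.0], [1.0, 4.0]),  # (0 + 4) / 2 = 2.0
    "y1": mse([0.0, 1.0], [1.0, 1.0]),  # (1 + 0) / 2 = 0.5
}

default_total = total_loss(losses)                          # 2.0 + 0.5
halved_total = total_loss(losses, {"y0": 0.5, "y1": 0.5})   # 1.0 + 0.25
print(default_total, halved_total)  # prints: 2.5 1.25
```

In the demo's compile call, the second case would correspond to also passing loss_weights={'y0': 0.5, 'y1': 0.5}. Because both towers share the experts and gates, a single backward pass on this summed loss updates the shared layers and both towers together.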