Scaling questions in multi-exit loss configurations.

seominseok0429 commented 4 years ago

Hello. I am a college student studying deep learning in Korea.

I read your paper and was impressed.

While reading it, I wondered: did you scale each loss (depending on the depth of its exit in the network) when configuring the multi-exit loss?

For example (depth: exit1 < exit2 < exit3):

    exit1, exit2, exit3 = model(input)

    loss1 = criterion(exit1, targets)
    loss2 = criterion(exit2, targets)
    loss3 = criterion(exit3, targets)

No scaling:

    total_loss = loss1 + loss2 + loss3

With scaling:

    total_loss = 0.1*loss1 + 0.2*loss2 + 0.7*loss3

If you didn't scale, can you tell me why?

I wanted to work this out by myself, but I couldn't. I'm really sorry.

Best regards

mary-phuong commented 4 years ago

Hi. No, I didn't scale the losses, but I think in principle you should scale the losses based on the relative importance of the exits (more important exits should get higher weights). The reason I don't scale is that in my experiments, there is no such prior importance information.
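The two options discussed here can be sketched in a few lines. This is a minimal illustration, not code from the repository: the helper name `combine_exit_losses` and the numeric loss values are hypothetical, and the weights 0.1/0.2/0.7 are simply the ones from the question. The helper only uses `*` and `+`, so it should work the same way with plain floats or with scalar tensors returned by a criterion.

```python
from typing import Optional, Sequence


def combine_exit_losses(losses: Sequence, weights: Optional[Sequence[float]] = None):
    """Combine per-exit losses into a single training loss.

    `losses` holds one loss per exit, shallowest exit first.
    `combine_exit_losses` is a hypothetical helper for illustration only.
    """
    if len(losses) == 0:
        raise ValueError("need at least one exit loss")
    if weights is None:
        # No prior importance information: every exit counts equally,
        # which reduces to the plain sum loss1 + loss2 + loss3.
        weights = [1.0] * len(losses)
    if len(weights) != len(losses):
        raise ValueError("need exactly one weight per exit")
    total = weights[0] * losses[0]
    for w, loss in zip(weights[1:], losses[1:]):
        total = total + w * loss
    return total


# Stand-ins for criterion(exit_k, targets); the numbers are made up.
loss1, loss2, loss3 = 1.0, 2.0, 3.0

# Unweighted sum (what the reply describes using).
unweighted = combine_exit_losses([loss1, loss2, loss3])

# Weighted sum with the example weights from the question.
weighted = combine_exit_losses([loss1, loss2, loss3], weights=[0.1, 0.2, 0.7])
```

If some exits matter more at inference time, the weights are where that prior would go; with no such prior, leaving `weights` unset gives the unweighted sum.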

(In reply to seominseok's message of Wed, 12 Feb 2020 at 10:32, quoted above.)


seominseok0429 commented 4 years ago

Thank you very much for your kind reply.

mary-phuong / multiexit-distillation

Scaling questions in multi-exit loss configurations. #3