Closed vr140 closed 4 years ago
Hi! Thanks for your contribution, and great first issue!
That's what PyTorch's autograd module handles itself. If, during a forward pass, a model, a branch of the model, or a layer of the model is involved in calculating the final loss and its parameters have `requires_grad=True`, those parameters will be updated during gradient descent. For a weighted loss, weighted gradients are calculated in the first step of backpropagation w.r.t. the final loss. Setting up a single loss value in `training_step` is all you need.
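A minimal sketch of this (the layer names and the 0.1/0.9 weights are assumptions for illustration): two branches, each producing its own loss, combined into one scalar. Calling `backward()` on that scalar is enough for autograd to route the correctly scaled gradients into each branch's parameters.

```python
import torch
import torch.nn as nn

# Stand-ins for the two branches (names assumed for illustration).
ae = nn.Linear(8, 8)   # "autoencoder" branch
clf = nn.Linear(8, 3)  # "classifier" branch

x = torch.randn(4, 8)
y = torch.randint(0, 3, (4,))

loss_ae = nn.functional.mse_loss(ae(x), x)
loss_clf = nn.functional.cross_entropy(clf(x), y)

# One weighted scalar loss; autograd handles the rest.
loss = 0.1 * loss_ae + 0.9 * loss_clf
loss.backward()

# Both branches received gradients from the single backward() call.
print(ae.weight.grad is not None, clf.weight.grad is not None)  # True True
```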
Thanks! " For weighted loss, weighted gradients will be calculated in the first step of backward propagation w.r.t to the final loss."
Does this assume I use a WeightedLoss class, instead of multiplying by the weights myself? I.e., what I'm wondering is how it knows to use the 0.2/0.8 weights for the branches of Inception and then the 0.1/0.9 weights for the autoencoder vs. Inception?
You don't have to use a WeightedLoss class; I don't think one exists. You can manually multiply the loss by some real number. To understand the second part you'd have to work through some math by hand, which is hard to explain in a chat 😅. Put simply: since you scaled your loss, the scaled loss is what's used during backprop to calculate the gradients, so the gradients end up scaled by the same factor. When you add two or more losses, during backprop each of them receives the same gradient from the previous backprop step. For a better explanation, I suggest: https://youtu.be/i94OvYb6noo
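Both claims are easy to verify numerically (a minimal sketch): scaling a loss by `w` scales its gradients by `w`, and each term of a summed loss receives the same upstream gradient (1.0) during backprop.

```python
import torch

# Scaling: d/da [w * a^2] = 2 * w * a.
w = 0.2
a = torch.tensor(3.0, requires_grad=True)
(w * a * a).backward()
print(a.grad)  # tensor(1.2000) == 2 * 0.2 * 3.0

# Summing: each term of (b^2 + c^2) gets the same upstream gradient of 1.0,
# so b and c each receive their own unscaled gradient 2 * 3.0 = 6.0.
b = torch.tensor(3.0, requires_grad=True)
c = torch.tensor(3.0, requires_grad=True)
(b * b + c * c).backward()
print(b.grad, c.grad)  # tensor(6.) tensor(6.)
```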
Got it, thanks! What if, instead of setting my own weights for the losses, I wanted to learn them? What would I need to change?
You can use a torch parameter for the weights (p and 1-p), but that would probably cause the network to lean towards one loss, which defeats the purpose of using multiple losses.
If you want the weights to change during training, you can use a scheduler to update the weight (e.g., increasing p with each epoch/batch).
Yeah, something like `p = nn.Parameter(torch.tensor(0.5))` in your model's `__init__`.
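A minimal sketch of a learnable loss weight (class and attribute names are assumptions): registering `p` as an `nn.Parameter` in `__init__` makes it show up in `model.parameters()`, so the optimizer updates it alongside the model weights.

```python
import torch
import torch.nn as nn

class WeightedTwoLossModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.ae = nn.Linear(8, 8)   # stand-in autoencoder branch
        self.clf = nn.Linear(8, 3)  # stand-in classifier branch
        # Learnable mixing weight; the optimizer will update it too.
        self.p = nn.Parameter(torch.tensor(0.5))

    def losses(self, x, y):
        loss_ae = nn.functional.mse_loss(self.ae(x), x)
        loss_clf = nn.functional.cross_entropy(self.clf(x), y)
        return self.p * loss_ae + (1 - self.p) * loss_clf

model = WeightedTwoLossModel()
print("p" in dict(model.named_parameters()))  # True
```

In practice you'd also want to keep the weight in (0, 1), e.g. by storing a raw parameter and passing it through `torch.sigmoid` before mixing; otherwise gradient descent can push `p` outside that range (or toward one loss, as noted above).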
Thank you!
What is your question?
I'm trying to replicate the model built in https://github.com/ekagra-ranjan/AE-CNN with PyTorch Lightning as a way to learn the framework.
They use an autoencoder along with a CNN (e.g., Inception V3), which means there are multiple loss functions, one per model, with a separate weight for each:
How would I ensure the losses are backpropagated correctly through:
a) the different models (the autoencoder and Inception V3)
b) the different branches of Inception V3
Code
Here is my Lightning module code:
When I look in the original source code, I see they simply do:
which seems identical to what PyTorch Lightning would be doing. But again, what's not clear is how this loss is backpropagated correctly through:
a) the different models (the autoencoder and Inception V3)
b) the different branches of Inception V3
Should I configure multiple/weighted losses differently to ensure the correct losses are backpropagated to the respective models/branches, or is setting up a single loss value in `training_step` sufficient?
What's your environment?