Closed: IvanFei closed this issue 3 years ago
Hi,
When backward is called after the second forward pass, the newly computed gradients are accumulated with the gradients from the first forward pass instead of being refreshed.
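A minimal sketch of this accumulation behavior, using a hypothetical toy model (the names and shapes are illustrative, not from the repo):

```python
import torch

# Hypothetical one-layer model to illustrate gradient accumulation.
model = torch.nn.Linear(4, 1)
x1, x2 = torch.randn(2, 4), torch.randn(2, 4)

loss1 = model(x1).sum()  # first forward pass, builds graph 1
loss2 = model(x2).sum()  # second forward pass, builds graph 2

loss1.backward()  # populates each parameter's .grad from graph 1
loss2.backward()  # gradients are ADDED onto .grad, not overwritten

# Equivalently, (loss1 + loss2).backward() produces the same .grad,
# so two forward passes followed by one backward on the summed loss
# gives correct gradients for both losses.
```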
Hi,
Thank you for your reply. You are correct; I misunderstood the forward mechanism of PyTorch.
In PyTorch, when you compute the forward pass using different inputs, each output has its own computation graph attached to it; it is not overwritten by the next call.
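A small sketch showing that each forward call keeps its own graph (toy scalar tensors; values are illustrative):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

out1 = w * 3.0  # first forward: graph 1, saves its own inputs
out2 = w * 5.0  # second forward: graph 2; graph 1 is untouched

# Each output still points at its own, distinct graph node.
print(out1.grad_fn, out2.grad_fn)  # two separate MulBackward0 objects

(out1 + out2).backward()  # one backward flows through both graphs
print(w.grad)             # tensor(8.) == 3 + 5, both paths contribute
```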
best wishes
Hi @backseason, I found that the joint training uses a schedule with two forward passes and one backward pass. It seems the network would not correctly compute the gradient of the edge-image loss: the second forward pass would refresh the network state from the first forward pass, yet that forward state is important for computing the network's gradients (see the chain rule below).
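For reference, the chain rule being invoked, stated generically (the original attachment is not reproduced here): for a loss $L$ on the network output $y = f(x; \theta)$,

$$\frac{\partial L}{\partial \theta} = \frac{\partial L}{\partial y} \cdot \frac{\partial y}{\partial \theta},$$

where $\partial y / \partial \theta$ is evaluated using the intermediate activations saved during that output's own forward pass.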