WongKinYiu / PartialResidualNetworks

partial residual networks

Questions about gradient timestamp and gradient source #18

Open jacksonsc007 opened 3 weeks ago

jacksonsc007 commented 3 weeks ago

I have recently been working through the PRN, CSPNet, and ELAN series of papers. They are closely related, but I find it hard to grasp the insights behind them. I searched the community for explanations but found nothing that helped. Here is a (fundamental) question I have:

  1. The concepts of gradient timestamp and gradient source in PRN. I could not find a precise definition of either. My question is: why should we care about the timestamp of a gradient? In each iteration, all we need is the final gradient with respect to each learnable parameter. If the goal of PRN is to increase the diversity of gradient combinations, wouldn't a formula that states the explicit form of the gradient be clearer and more reader-friendly?
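
To make the request concrete, here is a rough sketch of the kind of formula I have in mind. It rests on my own assumptions, not on the paper's notation: I assume a partial residual block adds the shortcut to only the first $k$ of $C$ channels, and the symbols $F$, $m$, $k$, and $\mathcal{L}$ are names I introduce for illustration.

```latex
% Assumed forward pass of one partial residual block:
% x is the block input, F the residual branch, m a fixed 0/1 channel mask
% with m_c = 1 for c <= k and m_c = 0 otherwise.
y = F(x) + m \odot x

% Chain rule for the gradient reaching the block input:
\frac{\partial \mathcal{L}}{\partial x}
  = \left( \frac{\partial F}{\partial x} \right)^{\!\top}
    \frac{\partial \mathcal{L}}{\partial y}
  + m \odot \frac{\partial \mathcal{L}}{\partial y}
```

Under these assumptions, $k = C$ recovers an ordinary residual block and $k = 0$ a plain layer. Channels with $m_c = 0$ receive gradient only through $F$, while channels with $m_c = 1$ also receive the unmodified shortcut term. Is this difference what "gradient source" refers to, and how does "timestamp" enter a formula like this?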

@WongKinYiu