shahsohil / DCC

This repository contains the source code and data for reproducing the results of the Deep Continuous Clustering paper.
MIT License

Gradients of U with respect to F (feature map) #16

Closed LemonPi closed 5 years ago

LemonPi commented 5 years ago

Hey @shahsohil, can you clarify how I could use the DCC output (the Z or U representatives) to get the gradient of some future loss function L(Z) with respect to my feature transform parameters F(X|theta)? (I'm using Z in the diagram to match the notation in the paper, but I'm actually using U.)

My current understanding of the data flow is summarized by the flowchart below. The dashed arrows are routes where the gradient can backpropagate. The green boxes hold parameters that require gradients in the PyTorch sense. The red dashed line for dF/dX means the gradient theoretically exists but the current implementation does not allow for it. Gradient with respect to the feature transform means with respect to the parameters of the feature transform (d/dF means d/dtheta).

[Flowchart: Data flow - Page 3]

After DCC I have representatives U that I then use in some later steps of the pipeline. I can get a gradient with respect to U, but from the flowchart above there doesn't seem to be any way of propagating it back to F. The whole point of the pipeline is to learn the parameters of F, so the current architecture doesn't seem to work for this. One way to address it would be to bring the later processes that use U inside the DCC loop as terms in the cost function. Do you have any ideas (and is my interpretation of the data flow wrong)?
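For what it's worth, here is a minimal, self-contained sketch of the problem as I understand it (the names `encoder`, `X`, `Y`, `U` are illustrative, not the repository's actual API). Because U is a free leaf tensor optimized directly, a loss on U produces `U.grad` but nothing ever reaches theta:

```python
# Hypothetical sketch: a downstream loss on the detached representatives U
# cannot backpropagate into the encoder parameters theta.
import torch

torch.manual_seed(0)

X = torch.randn(8, 10)                      # toy input batch
encoder = torch.nn.Linear(10, 2)            # stand-in for F(X | theta)
Y = encoder(X)                              # encoder outputs, depend on theta
U = torch.nn.Parameter(Y.detach().clone())  # representatives: a leaf tensor

downstream_loss = U.pow(2).sum()            # some later loss L(U)
downstream_loss.backward()

print(U.grad is not None)                   # True: dL/dU exists
print(encoder.weight.grad)                  # None: nothing flowed to theta
```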

shahsohil commented 5 years ago

Data flow: there should be no path between Y and Z. The Z's are updated only using the DCC loss.

Currently there is no direct path between U and F. But the 'Y' updates are based on the U's, so U indirectly influences F, and U can be updated using any number of loss functions. If you want to work directly with F, you can instead utilise the representatives 'Y', which are the outputs of the encoder.
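To make the suggestion concrete, here is a minimal sketch (again with illustrative names, not the repository's API) of evaluating the downstream loss on Y = encoder(X) rather than on the detached U, with a DCC-style data term coupling Y to U; with this arrangement autograd does reach theta:

```python
# Hypothetical sketch: a downstream loss on the encoder outputs Y,
# plus a data term tying Y to the representatives U, lets gradients
# flow into the encoder parameters theta.
import torch

X = torch.randn(8, 10)
encoder = torch.nn.Linear(10, 2)            # stand-in for F(X | theta)
U = torch.randn(8, 2, requires_grad=True)   # representatives (leaf tensor)

Y = encoder(X)                              # differentiable w.r.t. theta
dcc_coupling = (Y - U).pow(2).sum()         # DCC-style term tying Y to U
downstream_loss = Y.pow(2).sum()            # later loss, now a function of Y

(dcc_coupling + downstream_loss).backward()
print(encoder.weight.grad is not None)      # True: gradients reach theta
```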