The paper states: "The gradients from the regression encoder are not propagated back to the heatmap-trained features (note the gradient-stopping connections in Figure 4)." However, I do not see this gradient stopping in your training step.
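To illustrate the behavior I expected, here is a minimal sketch. It assumes PyTorch, and the module names (`backbone`, `heatmap_head`, `regression_head`), layer sizes, and losses are hypothetical placeholders, not taken from this repository or from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real network parts (names are assumptions).
backbone = nn.Linear(8, 16)         # shared, heatmap-trained features
heatmap_head = nn.Linear(16, 4)     # heatmap branch
regression_head = nn.Linear(16, 2)  # regression encoder branch

x = torch.randn(3, 8)
features = backbone(x)

heatmap_out = heatmap_head(features)
# detach() cuts the graph here, so the regression loss cannot reach the backbone.
regression_out = regression_head(features.detach())

# Placeholder loss on the regression branch only.
regression_loss = regression_out.pow(2).mean()
regression_loss.backward()

# The regression head receives gradients, the shared backbone does not.
backbone_untouched = backbone.weight.grad is None
regression_trained = regression_head.weight.grad is not None

# The heatmap loss still trains the backbone as usual.
heatmap_loss = heatmap_out.pow(2).mean()
heatmap_loss.backward()
backbone_trained_by_heatmap = backbone.weight.grad is not None
```

In TensorFlow the equivalent would presumably be wrapping the features in `tf.stop_gradient` before the regression branch.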
The paper also mentions an offset.
I look forward to your reply.
Thank you for open-sourcing this project.