In Tutorial 8: Deep Energy-Based Generative Models, part of the contrastive divergence (CD) gradient is $\mathbb{E}_{q_\theta(x)}[\nabla_\theta E_\theta(x)]$, but what the code seems to estimate is $\nabla_\theta \mathbb{E}_{q_\theta(x)}[E_\theta(x)]$, as shown in the screenshot.
Am I wrong, or does this not make a big difference for gradient estimation?
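
To make the question concrete, here is a minimal sketch of the case where the two expressions coincide. It assumes the sampled points are treated as constants with respect to the parameters (for example, MCMC samples that are detached before the loss is computed); whether the tutorial code does this is an assumption here, since the screenshot is not reproduced. The toy energy and the helper names `energy`, `grad_energy`, `mean_of_grads`, and `grad_of_mean` are hypothetical and not taken from the tutorial. With fixed samples, the Monte Carlo average is a finite sum, so differentiating the average gives the same result as averaging the per-sample gradients:

```python
import math


def energy(theta, x):
    # Hypothetical toy energy E_theta(x); not the tutorial's network.
    return theta * x**2 + math.sin(theta) * x


def grad_energy(theta, x):
    # Analytic derivative of the toy energy w.r.t. theta for one sample.
    return x**2 + math.cos(theta) * x


def mean_energy(theta, samples):
    # Monte Carlo estimate of the expected energy over fixed samples.
    return sum(energy(theta, x) for x in samples) / len(samples)


def mean_of_grads(theta, samples):
    # Estimate of E_q[ grad_theta E_theta(x) ]: average of per-sample gradients.
    return sum(grad_energy(theta, x) for x in samples) / len(samples)


def grad_of_mean(theta, samples, eps=1e-6):
    # Estimate of grad_theta E_q[ E_theta(x) ] with the samples held fixed,
    # using a central finite difference on the averaged energy.
    up = mean_energy(theta + eps, samples)
    down = mean_energy(theta - eps, samples)
    return (up - down) / (2 * eps)


# Fixed samples stand in for detached sampler outputs (an assumption).
samples = [-1.3, 0.2, 0.7, 2.1]
theta = 0.5

a = mean_of_grads(theta, samples)
b = grad_of_mean(theta, samples)
print(a, b, abs(a - b) < 1e-6)
```

If the samples did depend on the parameters and gradients flowed through the sampler, the second expression would pick up an extra term, and the two would no longer match in general.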