XiaoZhang-NN closed this issue 2 years ago
(1) Section 3.2 describes the general SDE. Section 3.3 describes the design of SDE-Net. We design the Brownian motion term in SDE-Net to depend only on x_0 (this is clear in Equation 4, and all our later analyses are based on this formulation) because it is easier to optimize. Otherwise you would need to optimize it at every time point.
Even if the Brownian motion term is a constant \sigma that does not depend on any input (an even simpler case than our formulation), you still need a discretization method to approximate the final solution, since there is no closed-form solution.
Also, I am sorry, but I do not understand why the theoretical analysis would be meaningless if we use g(x_0) as the Brownian motion term. Note that even if the Brownian motion term is g(x_0) (this is not a constant but a function of x_0, and x_0 is your input variable), you still need to integrate it with respect to time t (and this is a stochastic integral rather than an ordinary integral, because of the Brownian motion). I am not sure what you mean by "only the drift term". To make the solution unique for every x_0 \in R^n, both f(x_t,t) and g(x_0) (as a function of x_0) need to be Lipschitz continuous. This is stated clearly in Equation 6.
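To make the point about discretization concrete, here is a minimal Euler-Maruyama sketch in which the drift is evaluated at the current state x_k and the diffusion is evaluated once on the input x_0. It is an illustration under stated assumptions, not the repository code: the function name `euler_maruyama_fixed_diffusion` and the arguments `f`, `g`, `T`, and `n_steps` are hypothetical, and plain NumPy callables stand in for the networks.

```python
import numpy as np

def euler_maruyama_fixed_diffusion(f, g, x0, T=1.0, n_steps=100, rng=None):
    """Approximate dx_t = f(x_t, t) dt + g(x_0) dW_t with Euler-Maruyama.

    f and g are hypothetical stand-ins for the drift and diffusion functions.
    """
    rng = np.random.default_rng() if rng is None else rng
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    dt = T / n_steps
    sigma = g(x0)  # diffusion evaluated once, on the input x_0
    x = x0.copy()
    for k in range(n_steps):
        t = k * dt
        # Brownian increment over one step: N(0, dt) in each coordinate
        dW = rng.normal(0.0, np.sqrt(dt), size=x.shape)
        # drift uses the current state x_k; diffusion stays g(x_0)
        x = x + f(x, t) * dt + sigma * dW
    return x
```

Even though `sigma` is fixed before the loop, the noise still accumulates step by step through `dW`, so the result is a discretized stochastic integral and not a drift-only ODE solve.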
I guess your misunderstanding arises because you are unfamiliar with stochastic differential equations. Maybe this tutorial on SDEs can help you: https://www.pims.math.ca/files/lec1_tw.pdf
(2) The two graphs were simulated using the geometric SDE while varying the variance of the Brownian motion.
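For readers who want to reproduce something similar, the sketch below simulates sample paths with Euler-Maruyama at two noise levels. It assumes that "geometric SDE" means geometric Brownian motion, dX_t = mu X_t dt + sigma X_t dW_t. The function `simulate_gbm` and all parameter values are hypothetical, not the settings used for the figure.

```python
import numpy as np

def simulate_gbm(x0, mu, sigma, T=1.0, n_steps=200, n_paths=5, seed=0):
    """Euler-Maruyama paths of dX_t = mu*X_t dt + sigma*X_t dW_t (assumed form)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    paths = np.empty((n_paths, n_steps + 1))
    paths[:, 0] = x0
    for k in range(n_steps):
        # independent Brownian increments, one per path
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        x = paths[:, k]
        paths[:, k + 1] = x + mu * x * dt + sigma * x * dW
    return paths

# Same drift, two illustrative noise levels (values chosen arbitrarily)
low_noise = simulate_gbm(x0=1.0, mu=0.5, sigma=0.1)
high_noise = simulate_gbm(x0=1.0, mu=0.5, sigma=1.0)
```

Plotting each row of `low_noise` and `high_noise` against time should show tightly clustered trajectories in the first case and widely spread ones in the second.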
Since the question is based on a misunderstanding of the paper, I am closing this issue.
(1) Why does g(t,x_t) in the SDE equation become a constant g(x_0) in the algorithm and the code? The code also uses g(x_0) during the iteration. According to the Euler-Maruyama scheme mentioned in your paper, the term should not be g(x_0); it should be g(x_k), evaluated inside the loop. In addition, the theoretical analysis in formula (5) is then meaningless, because it involves only the drift term.
(2) Is there corresponding code for the last two graphs in Figure 3?