interval_time t in ODEFunc and zero conditional code in NODEBlocks

BailiangJ commented 1 year ago

Hi,

thank you for the excellent work.

I have two questions regarding the ODEFunc and NODEBlock/Warper codes.

1. The argument t in the forward function of ODEFunc is not used, but the comment on line 66 says "#Computed dynamics of point x at time t". Could you elaborate a bit on this?
2. Regarding the conditional code c in the Warper module: the Warper module has number_of_steps NODEBlocks. ODEFunc is used in the first NODEBlock, and according to line 71, ODEFunc sets the conditional code to zero. Does this mean that only the first NODEBlock takes the conditional code as input, and the remaining NODEBlocks all take a zero conditional code as input? If yes, could you explain why?

Thanks a lot in advance.

Cheers

Siwensun commented 1 year ago

Our implementation is based on a "stationary" velocity field, so the velocity estimate does not depend on t. We did try encoding t into the estimation, but the difference in performance was not significant in our scenarios.
Zeroing out the predictions means those parameters are not updated, so the latent codes stay consistent across different timestamps. It is a hack for NeuralODE.
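
To illustrate both points, here is a minimal pure-Python sketch; it is not the repository's code. The names `stationary_velocity` and `ode_func`, the state layout (latent code first, then xyz), `CODE_DIM`, and the toy velocity formula are all assumptions made for illustration. The real ODEFunc presumably uses a learned network on tensors.

```python
from typing import List, Sequence

CODE_DIM = 2  # hypothetical latent-code length


def stationary_velocity(code: Sequence[float], xyz: Sequence[float]) -> List[float]:
    """Toy stand-in for a learned velocity network; it takes no time argument."""
    scale = sum(code)
    return [scale * coord for coord in xyz]


def ode_func(t: float, state: Sequence[float]) -> List[float]:
    """Return d(state)/dt for a state laid out as [code..., x, y, z]."""
    code, xyz = list(state[:CODE_DIM]), list(state[CODE_DIM:])
    d_code = [0.0] * CODE_DIM               # zero derivative: the code is never updated
    d_xyz = stationary_velocity(code, xyz)  # t is accepted by the signature but unused
    return d_code + d_xyz


state = [0.5, 0.25, 1.0, 2.0, 3.0]
# A stationary field gives the same dynamics at any t.
print(ode_func(0.0, state) == ode_func(0.7, state))
```

The solver still passes t because the signature f(t, y) requires it; a stationary field simply ignores it.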

BailiangJ commented 1 year ago

Hi @Siwensun, thank you for your reply.

Please correct me if I am wrong: according to the paper, the quasi time-varying velocity field is made up of K stationary velocity fields. In the implementation K = 4, which means the final diffeomorphic flow is obtained by integrating the velocity field in 4 steps (4 NODEBlocks). For each integration step, time t is not used (that's why it's 'stationary').
I understand that setting it to zero blocks the gradients from the integration to the latent code c. My question is this: the Warper has, e.g., 4 NODEBlocks. In the for loop, cxyz is updated with the output of each NODEBlock. The latent code c in cxyz is set to 0 in the ODEFunc used by the NODEBlock, which would mean that only the first NODEBlock gets a non-zero latent code, while the second, third and fourth NODEBlocks all get a zero latent code. Is this intended? If so, why would only the first NODEBlock take a non-zero latent code and the remaining NODEBlocks take a zero latent code?

Siwensun commented 1 year ago

Yes, you are correct.
The output of ODEFunc indicates how the values change, so a zero output means no change after that iteration. Thus, the latent code remains the same before and after each NODEBlock.
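
The effect can be checked with a small pure-Python sketch that chains several blocks. This is an illustration under assumptions, not the repository's code: `make_ode_func`, `node_block`, the per-block gains, and the fixed-step forward Euler integrator are all hypothetical stand-ins (the actual code uses torchdiffeq.odeint with a learned network).

```python
from typing import Callable, List

CODE_DIM = 2       # hypothetical latent-code length
EULER_STEPS = 10   # hypothetical number of solver steps per block

Dynamics = Callable[[float, List[float]], List[float]]


def make_ode_func(gain: float) -> Dynamics:
    """Build one stationary field; each block gets its own (hypothetical) gain."""
    def ode_func(t: float, state: List[float]) -> List[float]:
        code, xyz = state[:CODE_DIM], state[CODE_DIM:]
        scale = gain * sum(code)
        # Zero derivative for the code, non-zero velocity for the point.
        return [0.0] * CODE_DIM + [scale * coord for coord in xyz]
    return ode_func


def node_block(func: Dynamics, state: List[float],
               t0: float = 0.0, t1: float = 1.0) -> List[float]:
    """Integrate d(state)/dt = func(t, state) from t0 to t1 with forward Euler."""
    dt = (t1 - t0) / EULER_STEPS
    t = t0
    for _ in range(EULER_STEPS):
        deriv = func(t, state)
        state = [s + dt * d for s, d in zip(state, deriv)]
        t += dt
    return state


initial = [0.5, 0.25, 1.0, 2.0, 3.0]
blocks = [make_ode_func(g) for g in (0.1, 0.2, 0.3, 0.4)]  # K = 4 blocks

state = initial
for k, func in enumerate(blocks):
    state = node_block(func, state)
    # The code part is identical to the input at every block; only xyz moves.
    print(k, state[:CODE_DIM], state[CODE_DIM:])
```

Every block therefore receives the original, non-zero latent code, because a zero derivative adds nothing to it during integration.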

BailiangJ commented 1 year ago

Hi @Siwensun, thank you for your reply.

For question 2, I think I have figured it out. It's a feature of the function torchdiffeq.odeint.

Please correct me if I am wrong: since odeint solves the initial value problem dy/dt = f(t, y), y(t_0) = y_0, the ODEFunc class actually returns the derivative dy/dt. Therefore, by setting the dynamics of the latent code to zero, we are not updating/changing the latent code (as you have explained).
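
Written out, the argument looks roughly as follows. The notation is an assumption for illustration: c is the latent code, x the point coordinates, and v_k the stationary velocity field of the k-th NODEBlock, integrated from t_0 to t_1.

```latex
\frac{d}{dt}
\begin{pmatrix} c(t) \\ x(t) \end{pmatrix}
=
\begin{pmatrix} 0 \\ v_k\bigl(c(t), x(t)\bigr) \end{pmatrix},
\qquad
c(t_0) = c_0, \quad x(t_0) = x_0

c(t_1) = c_0 + \int_{t_0}^{t_1} 0 \, ds = c_0

x(t_1) = x_0 + \int_{t_0}^{t_1} v_k\bigl(c_0, x(s)\bigr) \, ds
```

Since c(t_1) = c_0, the next block starts from the same latent code, and v_k has no explicit dependence on t.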

My previous, incorrect understanding was that ODEFunc updates/outputs the latent code directly.

Thanks a lot for the explanations.

Siwensun / Neural_Diffeomorphic_Flow--NDF