question about Energy function

Lwy-1998 commented 2 years ago

Hello, your code is amazing! While digging into the energy function part, I had trouble understanding why on_diagonal_weights = -2 * combined_adaptiveregularization_weight and off_diagonal_weights = 1 + 2 * combined_adaptiveregularization_weight. Could you please explain it? Thank you!

how4rd commented 2 years ago

Thank you!

MeshFlow moves the mesh nodes by minimizing the energy function. For reference, that energy function is listed in MeshFlow: Minimum Latency Online Video Stabilization by S. Liu et al. The function is equation (1) on p. 806.

MeshFlow minimizes the energy function numerically using a solver based on the Jacobi method. The MeshFlow paper doesn't describe the solver, but the solver appears in another paper by the same author, Bundled Camera Paths for Video Stabilization by S. Liu et al. The solver is equation (6) on p. 5.

The lines you're referring to compute the coefficients of the solver. The matrices containing those coefficients, on_diagonal_weights and off_diagonal_weights, eventually get passed into _get_jacobi_method_output to minimize the energy function.
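For readers who want to see the shape of such a solver, here is a minimal sketch of a Jacobi-style iteration on a single 1-D path. It is not the repository's `_get_jacobi_method_output`. It assumes the update rule has the form suggested by the coefficients quoted in the question, with `1 + 2 * lam * sum(w)` on the diagonal and `2 * lam * w` off the diagonal, plus symmetric Gaussian weights and one constant `lam` for all frames. The names `gaussian_weights`, `energy`, and `jacobi_smooth` are hypothetical and chosen only for illustration.

```python
import math


def gaussian_weights(num_frames, radius, sigma):
    """Symmetric weights w[t][r] for 0 < |t - r| <= radius (hypothetical helper)."""
    weights = [[0.0] * num_frames for _ in range(num_frames)]
    for t in range(num_frames):
        for r in range(max(0, t - radius), min(num_frames, t + radius + 1)):
            if r != t:
                weights[t][r] = math.exp(-((t - r) ** 2) / (2.0 * sigma ** 2))
    return weights


def energy(path, unstabilized, weights, lam):
    """Data term plus lam-weighted smoothness term, summed over all frames."""
    n = len(path)
    total = 0.0
    for t in range(n):
        total += (path[t] - unstabilized[t]) ** 2
        total += lam * sum(weights[t][r] * (path[t] - path[r]) ** 2 for r in range(n))
    return total


def jacobi_smooth(unstabilized, weights, lam, num_iterations):
    """Repeat P(t) <- (C(t) + 2*lam*sum_r w[t][r]*P(r)) / (1 + 2*lam*sum_r w[t][r])."""
    n = len(unstabilized)
    # Assumed diagonal coefficients: gamma[t] = 1 + 2 * lam * sum_r w[t][r].
    gamma = [1.0 + 2.0 * lam * sum(weights[t]) for t in range(n)]
    path = list(unstabilized)  # start from the unstabilized path
    for _ in range(num_iterations):
        # Jacobi: every new value is computed from the previous iterate only.
        path = [
            (unstabilized[t]
             + 2.0 * lam * sum(weights[t][r] * path[r] for r in range(n))) / gamma[t]
            for t in range(n)
        ]
    return path


unstabilized = [0.0, 3.0, 1.0, 4.0, 2.0, 5.0, 3.0, 6.0]  # jittery 1-D path
weights = gaussian_weights(len(unstabilized), radius=2, sigma=1.0)
stabilized = jacobi_smooth(unstabilized, weights, lam=1.0, num_iterations=200)
print([round(p, 3) for p in stabilized])
print(energy(unstabilized, unstabilized, weights, 1.0), energy(stabilized, unstabilized, weights, 1.0))
```

Under these assumptions the iteration converges because each diagonal coefficient is larger than the sum of the off-diagonal coefficients in its row. A larger `lam` pushes that ratio toward 1, so more iterations are needed.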

(I must admit I am not sure how to derive the solver equation! I tried deriving it myself by setting partial derivatives to 0, but some coefficients in my result didn't match the ones from the second paper. My equation didn't work in practice, but the one from the paper did, so I used the latter. I checked my derivation several times, so I suspect I misunderstood something.)

Please let me know if this explanation helps!

Lwy-1998 commented 2 years ago

:) Thanks for your patient reply. It really helps!

I have reviewed Bundled Camera Paths for Video Stabilization by S. Liu et al. before. I think I understand what the authors are trying to say and the whole process of minimizing the energy function with the Jacobi method, but I just can't derive the solver equation myself. :( (If possible, could you please tell me your idea of how to derive this equation?)

I asked my supervisor for help, and he said that this is the LM algorithm and that the derivation is very complicated. He coded this type of algorithm during his Ph.D., and it ran extremely slowly (compared with MATLAB or a C++ library).

I have another question about extracting the variable vertex_unstabilized_displacement. I checked the displacement for every mesh cell (grid) and found that it can sometimes be very large (400 to 500 pixels), though maybe I misunderstood your code. Even so, the output video is very stable.

I implemented the source code in VS2019 (C++) and extracted the mesh vertex displacements. There, the unstabilized displacement is very subtle compared with yours. So maybe your definition of displacement is different from the author's?

Thanks for your reply again

how4rd commented 2 years ago

Oh interesting. I hadn't heard of the LM algorithm before. Thanks for sharing that!

Here is how my (incorrect) derivation worked:

The energy function in the paper is a function of $\mathbf{P}(t)$, which is the vector containing the cumulative displacements of all the mesh nodes at time $t$. If we say there are $n$ nodes in the mesh, then $\mathbf{P}(t)$ has $2n$ components (a component for each node's $x$- and $y$- displacements), and the energy function also has $2n$ components. The $2n$ components of the energy function are independent of each other. Thus we can optimize each of the energy function's $2n$ components separately.

Consider the $i^{th}$ component of the energy function. That component is a function of the $i^{th}$ component of $\mathbf{P}(t)$, which we can call $p_{i}(t)$. I wanted to find the value of $p_{i}(t)$ that minimizes the $i^{th}$ component of the energy function. To do that, I differentiated the $i^{th}$ component of the energy function with respect to $p_{i}(t)$. I set that derivative to $0$ and then solved for $p_{i}(t)$. Unfortunately, that result didn't match the paper, and it didn't work in practice either.

As for vertex_unstabilized_displacement, if my implementation produces smoother video than the C++ implementation, then it's probably because my implementation uses different adaptive weights (different $\lambda_{t}$ terms in the energy function). By default, my implementation sets $\lambda_{t} = 100$, so it warps the video quite aggressively and produces a smooth video with distortion artifacts. The original paper sets $\lambda_{t}$ in a more complicated way involving eigenvalues. To try my implementation with the paper's $\lambda_{t}$ values, you can run stabilizer.stabilize with adaptive_weights_definition set to ADAPTIVE_WEIGHTS_DEFINITION_ORIGINAL instead of ADAPTIVE_WEIGHTS_DEFINITION_CONSTANT_HIGH. If you're curious, there's some more info here.

Lwy-1998 commented 2 years ago

Thanks for sharing your idea.

I treat $\mathbf{P}(r)$ as a constant, so the final result I derived is $\mathbf{P}(t) = \frac{1}{\gamma}\mathbf{C}(t) + \sum_{r} \frac{\lambda_{t} w_{t,r}}{\gamma}\mathbf{P}(r)$ with $\gamma = 1 + \lambda_{t}\sum_{r} w_{t,r}$ (which is missing the coefficient 2 compared with the paper).

If I instead treat $\mathbf{P}(r)$ as depending on $\mathbf{P}(t)$, I don't know how to get rid of $\mathbf{P'}(r)$.

If you have any suggestions, please let me know :)

how4rd commented 2 years ago

I'm afraid I don't have more suggestions offhand. :/ I don't have my old work to check against, but I remember my result was missing a 2 somewhere like yours.
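One possible source of the missing 2, offered as a sketch under assumptions rather than something confirmed in this thread. Assume the energy has the form below, the weights are symmetric ($w_{t,r} = w_{r,t}$, with $r$ a neighbor of $t$ exactly when $t$ is a neighbor of $r$), and $\lambda$ is one constant for every frame:

$$O = \sum_{t} \Big( \lVert \mathbf{P}(t) - \mathbf{C}(t) \rVert^{2} + \lambda \sum_{r \in \Omega_{t}} w_{t,r} \lVert \mathbf{P}(t) - \mathbf{P}(r) \rVert^{2} \Big)$$

Under those assumptions, $\mathbf{P}(t)$ appears in the smoothness term twice: once in frame $t$'s own sum, and once in each neighbor $r$'s sum as $w_{r,t} \lVert \mathbf{P}(r) - \mathbf{P}(t) \rVert^{2}$. Differentiating the whole sum with respect to $\mathbf{P}(t)$, not just frame $t$'s summand, gives

$$\frac{\partial O}{\partial \mathbf{P}(t)} = 2\big(\mathbf{P}(t) - \mathbf{C}(t)\big) + 2\lambda \sum_{r} w_{t,r} \big(\mathbf{P}(t) - \mathbf{P}(r)\big) + 2\lambda \sum_{r} w_{r,t} \big(\mathbf{P}(t) - \mathbf{P}(r)\big) = 0$$

With $w_{r,t} = w_{t,r}$ the two sums are equal, so

$$\mathbf{P}(t) = \frac{1}{\gamma}\mathbf{C}(t) + \sum_{r} \frac{2\lambda w_{t,r}}{\gamma}\mathbf{P}(r), \qquad \gamma = 1 + 2\lambda \sum_{r} w_{t,r}$$

This has the same $1 + 2\lambda\sum w$ and $2\lambda w$ coefficients quoted in the original question. If $\lambda_{t}$ varies per frame, the second sum carries $\lambda_{r}$ instead of $\lambda_{t}$, so the match would only be approximate.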

Lwy-1998 commented 2 years ago

Haha. Anyway, thank you!

how4rd / meshflow

question about Energy function #2