Closed: jannschu closed this 4 years ago
I will review the line fitting test case sometime today or tomorrow. I am a bit busy atm, but I will definitely prioritize this.
I also changed the API slightly in master to accommodate an optimization I am doing with Lie algebras on rotation matrices. I don't think it will interfere with your code, but can you pull that into this branch?
Hi all, thanks for sharing this code. I also want to use the algorithm and ended up here.
It depends on how the residual is calculated. Let Y denote the measured values and Ŷ the estimated values. If you define ε = Y - Ŷ, then there is no need to add the minus sign. I prefer this definition, which is also the one used on Wikipedia.
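To spell out what I mean (a quick sketch, following the "data fitting" formulation on the Gauss–Newton page; β are the parameters):

$$
\varepsilon = Y - \hat{Y}(\beta),
\qquad
\frac{\partial \varepsilon_i}{\partial \beta_j} = -\frac{\partial \hat{Y}_i}{\partial \beta_j},
$$

so if $J$ denotes the Jacobian of the model $\hat{Y}$, the Gauss–Newton update is $\beta \leftarrow \beta + (J^\mathsf{T} J)^{-1} J^\mathsf{T} \varepsilon$, while with the Jacobian of the residuals $\varepsilon$ the same step carries a minus sign.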
@J-F-Liu You may want to wait a bit to use this as @jannschu is working on a branch that refactors it. See #4.
If your residual is of this form and you flip its sign, your Jacobian also flips sign. So this is not correct; the sign was wrong, I think.
edit: I should clarify that the Jacobian has to be computed with respect to the residual vector, not a measurement.
@jannschu If the Jacobian is negative and you flip the residuals to be negative, then it should cancel out, unless I am mistaken.
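To illustrate that cancellation concretely (a throwaway sketch in plain Rust, no crates; `gauss_newton_step` is just a one-parameter helper I made up, not part of this library):

```rust
// Throwaway sanity check: the one-parameter Gauss-Newton step
// (J^T J)^{-1} J^T r is unchanged if BOTH J and r flip sign.
fn gauss_newton_step(j: &[f64], r: &[f64]) -> f64 {
    let jtj: f64 = j.iter().map(|v| v * v).sum();
    let jtr: f64 = j.iter().zip(r).map(|(a, b)| a * b).sum();
    jtr / jtj
}

fn main() {
    let j = [1.0, 2.0, 3.0];
    let r = [0.5, -1.0, 0.25];
    let j_flipped: Vec<f64> = j.iter().map(|v| -v).collect();
    let r_flipped: Vec<f64> = r.iter().map(|v| -v).collect();
    // (-J)^T (-r) = J^T r and (-J)^T (-J) = J^T J, so the step is identical.
    assert_eq!(
        gauss_newton_step(&j, &r),
        gauss_newton_step(&j_flipped, &r_flipped)
    );
    println!("step = {}", gauss_newton_step(&j, &r));
}
```

Flipping only one of the two, on the other hand, negates the step, which is exactly the kind of sign mistake this PR is about.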
The Jacobian doesn't relate to the residual. From my understanding, `residual = target - model(params)` and `jacobian = partial_diff(model, params)`.
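As a tiny concrete instance of that convention (my own example, a straight-line model `model(x; a, b) = a*x + b`):

$$
r_i = y_i - (a x_i + b),
\qquad
\frac{\partial\, \text{model}}{\partial a} = x_i, \quad
\frac{\partial\, \text{model}}{\partial b} = 1,
$$

so the Jacobian computed from the model has rows $(x_i, 1)$, whereas the Jacobian of the residuals would have rows $(-x_i, -1)$.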
@J-F-Liu Each term (i, j) of the Jacobian is of the form "partial derivative of residual i with respect to variable j". So yes, the sign of the Jacobian depends on how you compute your residual. Usually we model our problem as a "data fitting problem" (look for "In data fitting" in https://en.wikipedia.org/wiki/Gauss%E2%80%93Newton_algorithm), and in that case we are computing the partial derivatives of (data - model). Therefore, if you compute the "Jacobian of the model function" and not the "Jacobian of the residuals (data_i - model_i)", the "model" appears with a negative sign. In that setup, if you still call "gradient" the value `J^T r`, then the guess is updated with `guess += hessian * gradient` instead of `-=`.
If I read the code correctly, the update-guess function is left to the implementation. So the implementations of `apply_delta`, `residuals`, and `jacobians` have to be coherent with each other. In any case, accumulating gradients should just be `grad += J_i^T * r_i`; there should not be any minus sign there in my opinion, so I don't really agree with the change. I suppose the issue here is more about naming and documenting what those functions are, so that implementations of the three are correct with regard to each other. But I'll let you, @vadixidav and @J-F-Liu, decide how to proceed.
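For illustration only, a coherent trio for a straight-line fit could look like the sketch below. The struct and signatures are made up (this is not the actual trait of the crate); only the three function names come from the discussion above.

```rust
// Hypothetical sketch, not the actual trait of this crate: a straight-line
// model y = a*x + b, with the convention residual_i = y_i - (a*x_i + b).
// Then the Jacobian of the residuals has rows (-x_i, -1), the gradient is
// accumulated as grad += J_i^T * r_i, and the Gauss-Newton delta
// (J^T J)^{-1} * grad is SUBTRACTED from the parameters in apply_delta.
struct LineFit {
    xs: Vec<f64>,
    ys: Vec<f64>,
}

impl LineFit {
    /// residual_i = y_i - (a * x_i + b)
    fn residuals(&self, a: f64, b: f64) -> Vec<f64> {
        self.xs
            .iter()
            .zip(&self.ys)
            .map(|(&x, &y)| y - (a * x + b))
            .collect()
    }

    /// Row i of the Jacobian of the residuals with respect to (a, b).
    fn jacobians(&self) -> Vec<[f64; 2]> {
        self.xs.iter().map(|&x| [-x, -1.0]).collect()
    }

    /// delta is assumed to be (J^T J)^{-1} * grad; the minus sign here is
    /// what keeps this coherent with `residuals` and `jacobians` above.
    fn apply_delta((a, b): (f64, f64), delta: [f64; 2]) -> (f64, f64) {
        (a - delta[0], b - delta[1])
    }
}

fn main() {
    let fit = LineFit { xs: vec![0.0, 1.0, 2.0], ys: vec![0.1, 1.1, 2.2] };
    let (a, b) = (1.0, 0.0);
    println!("residuals: {:?}", fit.residuals(a, b));
    println!("jacobian rows: {:?}", fit.jacobians());
    println!("after delta: {:?}", LineFit::apply_delta((a, b), [0.05, -0.02]));
}
```

An implementation could just as well keep the model Jacobian with rows `(x_i, 1)` and a `+=` in `apply_delta`; the only requirement is that the three pieces agree on one convention.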
I see. In the case of computing the partial derivatives of (data - model), we can think of the target values as 0 and of either (data - model) or (model - data) as a new_model, so that `residual = target - new_model(params)`, `jacobian = partial_diff(new_model, params)`, and `guess += hessian * gradient` still holds, thus simplifying the API definition.
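Spelling that out (taking `new_model = model - data`): with target $t = 0$,

$$
r = t - \text{new\_model}(\beta) = \text{data} - \text{model}(\beta),
\qquad
\frac{\partial\, \text{new\_model}}{\partial \beta} = \frac{\partial\, \text{model}}{\partial \beta},
$$

so the same `guess += hessian * gradient` update applies without any special case.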
@mpizenberg I am letting @jannschu decide which form we want. I think we are going with the more traditional form (the one stated on the Wikipedia page), which is fine. @jannschu is doing a huge rewrite, so this PR is mostly irrelevant anyways.
The wrong sign was used for the new parameters.
Fixes the issue described in #4.
edit: I pushed a possible fix for the line fitting test case. You might want to check my derivatives, just to be sure.
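In case it helps with checking the derivatives, here is a throwaway finite-difference comparison (my own sketch; it assumes a plain `y = a*x + b` residual convention, which may well differ from how the actual line fitting test case parameterizes the line):

```rust
// Compare the analytic Jacobian of the residuals r_i = y_i - (a*x_i + b)
// against central finite differences. Purely illustrative.
fn residuals(a: f64, b: f64, xs: &[f64], ys: &[f64]) -> Vec<f64> {
    xs.iter().zip(ys).map(|(&x, &y)| y - (a * x + b)).collect()
}

fn main() {
    let xs = [0.0, 1.0, 2.0, 3.0];
    let ys = [0.1, 1.2, 1.9, 3.2];
    let (a, b) = (0.9, 0.2);
    let h = 1e-6;

    for (i, &x) in xs.iter().enumerate() {
        // Analytic derivatives of residual i with respect to (a, b).
        let (da, db) = (-x, -1.0);
        // Central finite differences.
        let fd_a = (residuals(a + h, b, &xs, &ys)[i]
            - residuals(a - h, b, &xs, &ys)[i])
            / (2.0 * h);
        let fd_b = (residuals(a, b + h, &xs, &ys)[i]
            - residuals(a, b - h, &xs, &ys)[i])
            / (2.0 * h);
        assert!((da - fd_a).abs() < 1e-4 && (db - fd_b).abs() < 1e-4);
        println!("residual {i}: analytic ({da}, {db}), finite diff ({fd_a:.6}, {fd_b:.6})");
    }
}
```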