jdtuck / fdasrvf_R

Functional Data Analysis using Square-Root Slope Framework
10 stars 13 forks source link

Karcher mean computation #34

Open araiari opened 6 months ago

araiari commented 6 months ago

While working with the package, @astamm and I noticed two possible incongruences with the reference paper (Tucker et al., 2013). Specifically,

  1. Missing normalization step In the two functions curve_karcher_mean() and multivariate_karcher_mean(), the computation of the Karcher mean never includes the normalization step of the mean. Currently, the output mean represents an element of the entire equivalence class of solutions and the inverse of the mean warping is not applied for the choice of the preferred mean in the equivalence class. We think that this part should be added to the code. Moreover, not from a theoretical point of view, but for the stability of the implementation, we argue that it is better to include such normalization step inside the for-loop of iterations: after the alignment of the SRVF and before mean computation, they should be aligned back though the inverse mean warping.

  2. Computation of betamean In the function kmeans_align(), when computing the templates for the case of L=1, i.e., univariate functions, the mean of fn is computed as the pointwise mean of the fn. However, it should be computed as the inverse SRVF of the mean of the qn.

If you agree about this, @astamm and I could submit a PR in an attempt to fix these issues.

jdtuck commented 6 months ago
  1. Assuming you mean normalization by picking an element of the mean orbit where the mean of $\gamma$ is $\gamma_{id}$. It should be apart of multivariate_karcher_mean(), but we as a research group have debated this on curve_karcher_mean(), it is still an orbit and we can pick a portion of the orbit, but have not seen the need to pick that element. I do not agree with the stability comment, please provide evidence here. This will increase computational time which I do not feel is worth what you are describing.
  2. Do not do this, this has numerical issues and this computation is done in this matter by finding the mean of fn then computing to SRVF afterwards. We have found through a lot of testing finding the mean of the SRVF and going to back to fn introduces a lot of integration error.

As usual please put in a PR.

astamm commented 6 months ago

Hi @jdtuck.

About the first point, I am not familiar with the vocabulary here (orbits and all) but in your 2013 paper, you do describe in Algorithm 1, fifth step, the recentering of the optimal warpings that have been found. This is what is currently missing in our opinion in the *_karcher_mean() functions.

About the second point, I understand numerical errors due to integration have to be handled but it is not the same thing to compute the pointwise mean in SRVF space or in original space. In particular, the pointwise mean is the Frechet mean associated to the L2 distance which is therefore appropriate in SRVF space. Hence, our comment. When you do pointwise mean in original space (fn), you use the Frechet mean associated to the L2 distance in a space of functions that are not guaranteed to be square-integrable in my opinion.

That being said, you are way more expert on the matter than we are, so I might be missing a point here. Thanks for taking the time to think on it with us.

jdtuck commented 6 months ago

First point, we can add it, but I will go back to what I said, should be added to multivariate, but not 100% sure on why its needed on curve. Its not a normalization either, that is an incorrect term. The mean is a family of solutions as the mean is invariant to random warping. So you are just picking an element of that family. For open and closed curves this is not usually needed for further analysis and any element will do. For functions it matters more, especially when looking at plots and comparing to the original data.

Second point, yes its a different metric, but the results is numerically very similar as the functions are already aligned. The integration is error is much larger than the difference in metric after alignment.

astamm commented 6 months ago

First point: good to me. I understand from your comment that curve_*() functions seem to be used by a different community than functions handling functions. Hence maybe the reason why you keep the curve world separate from the function world. If I am correct, then it becomes counterintuitive to use calc_shape_dist() which seems to be for curves for computing distances for functions with multivariate codomain and it might be preferable to have a dedicated function to compute distances between functions with multivariate codomain, just as you did for the Karcher mean. Note that this function could internally call calc_shape_dist() with the appropriate options.

Second point: I understand. Would it be possible to have a small reproducible example that quantifies that? I am interested in investigating more on this problem.

jdtuck commented 6 months ago

I can try to work up an example for you.

The problem is multivariate functional data depending on the type fall with in the curve domain so that is why any changes we need to figure out how to make it inclusive in the world of "shape"