accord-net / framework

Machine learning, computer vision, statistics and general scientific computing for .NET
http://accord-framework.net
GNU Lesser General Public License v2.1
4.49k stars 1.99k forks source link

Procrustes analysis is giving weird/wrong results #723

Closed hohounk closed 7 years ago

hohounk commented 7 years ago

I have an application where I have to analyze data from sensors against reference data from the same sensors and in some cases, procrustes seems to break down and gives weird results. I've yet to see anything weird about the data I give it that could cause such issues.

This is an example of working case: good

Blue is the raw data coming from sensors. Red is the reference I compare against that is always constant. Green is a piece of data cut out* from the blue data that has passed through procrustes analysis.

*) I cut off a couple dozen samples from both ends of the blue line

Rough pseudocode of how I'm using it is like this:

double [][,] dataset;
dataset[0] = red_line;
dataset[1] = blue_cut; // Not visible on graph here but exact raw data is linked at bottom

analyser.Compute(0, dataset);
green_line = analyser.ProcrustedDatasets[1].Transform(analyser.ProcrustedDatasets[0]);

Here is an example of what happens when the bug gets triggered somehow. Notice how the green line is basically mirrored horizontally from center point: buggy

Does anyone know what could trigger such behavior and how to deal with it? I've been trying to go over the individual steps of the process with a fine tooth comb and so far I only know that the problem takes place in the Compute() function before the final Transform() step.

I put the input data of these two specific graphs + reference here in case someone wants to test it themselves: https://docs.google.com/spreadsheets/d/1A-C2mm7lV_zJy-i-ORaewjkvcVyjr4ubSMyHj8REHOs/edit?usp=sharing

Sadly, I do not have a simple test application I could share that would show the problem but it should be relatively trivial to create.

cesarsouza commented 7 years ago

Hi @hohounk,

Thanks a lot for opening this issue! However, even though you mentioned you do not have a sample test application you could share to reproduce the issue, is there any chance that you could actually provide some raw data (which does not have to be your own personal/company/proprietary data, but could also be a crafted example) that could actually reproduce the issue?

The point is that the Procrustes Analysis have originated from a user contribution, so it might take a while for me (or maybe even the original author, @remyzerems) to understand what is causing the issue and eventually come up with a fix. However, I think that having a test case showing the problem would make things much easier for us.

@remyzerems, do you have any idea what might be causing this issue? Is there anything in the original code that might cause the Procrustes Analysis to reverse slopes like this?

Regards, Cesar

remyzerems commented 7 years ago

Hi !

First, I'm glad to see this contribution is helpful in some way...

Well the issue you are facing seems to come from the rotation step. As you may know, Procrustes goes through three main steps which are centroid offset removal, scale normalization and finally rotation matching to fit to the reference. Since your results show what seems to be a mirror effect or a wrong rotation angle, I think the problem comes from this rotation step.

I managed to use the original Procrustes code I had and tried your datasets with (thanks it's very helpful by the way !). I was able to reproduce what you encounter. I looked into the Rotate function (line 224 of ProcustesAnalysis.cs) and after a while looking at the Amy Ross paper, I found the algebraic operation at line 230 seems to be the wrong one...

So, as a fix, I think that the correct way to do it would better be : double[,] Q = svd.RightSingularVectors.DotWithTransposed(svd.LeftSingularVectors);

@cesarsouza maybe you could quickly give an eye on this and confirm the mistake and its fix, I mean from the pure Math point of view (Amy Ross's paper is very helpful, paragraph 3.3 Rotation) ? For multiple reasons, I cannot spend much time right now on this issue, but I'm quite confident with this solution. If you do agree with the fix Cesar, could you make the change for me please ? Thank you. Otherwise, maybe I'll be able to further dig on this later on within a couple of weeks...

In parallel, @hohounk, maybe you could "extract" the Procrustes classes from Accord, make the change I mentionned and add these as part of your project as a temporary workaround. Moreover, as you seem to have multiple variable datasets from a real world application, this could really help in the process.

Regards, Remy

cesarsouza commented 7 years ago

Hi @remyzerems,

Thanks a lot for the input! Me too I am a bit short on time to take a look at this issue since it would involve getting through lots of literature (at least for me, right now). But I will include this change in the next pre-release version of the framework which hopefully @hohounk could be able to try.

Again, thanks a lot for coming back at this issue! I hope the framework can continue being helpful for you both!

Regards, Cesar

cesarsouza commented 7 years ago

Hi @hohounk,

I've just uploaded a new pre-release package to NuGet (3.6.4-alpha) which should contain fix mentioned above. Please let me know if you still have issues with this new version!

Hope it helps!

Regards, Cesar

cesarsouza commented 7 years ago

Changes have been incorporated in release v3.7.0.