Found the reason why. R-values, or more specifically the Pearson product-moment correlation coefficient, measure linear correlation. In other words, when the data is highly curved, the R-values provided are useless. In those cases, we need an alternate method to validate the interpolation.
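To make the failure mode concrete, here is a minimal sketch (the class and variable names are mine; `PearsonsCorrelation` is the real Apache Commons Math 3 API) showing Pearson's r collapsing to zero on perfectly quadratic data:

```java
import org.apache.commons.math3.stat.correlation.PearsonsCorrelation;

public class PearsonDemo {
    public static void main(String[] args) {
        // A perfect parabola: y = x^2. The relationship is exact,
        // but it is not linear, so Pearson's r tells us nothing useful.
        double[] x = {-2, -1, 0, 1, 2};
        double[] y = { 4,  1, 0, 1, 4};

        double r = new PearsonsCorrelation().correlation(x, y);

        // Prints 0.0: the symmetric curve has no linear trend,
        // even though y is completely determined by x.
        System.out.println("Pearson's r = " + r);
    }
}
```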
(Kind of) Confirmed: there are no really good general-purpose algorithms that calculate the goodness of fit of an arbitrary curve and provide a benchmark for comparison (nor for confidence intervals).
Changes will be as follows:
1. Pearson's R values will be provided for InterpolationType.LINEAR interpolations.
2. Goodness-of-fit Sum of Squares Error (SSE) values will be provided for every other InterpolationType interpolation, as sketched below.
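For reference, a minimal sketch of the planned SSE measure, assuming the fitted curve is exposed as a Commons Math `UnivariateFunction` (the actual signature used in this project may differ):

```java
import org.apache.commons.math3.analysis.UnivariateFunction;

public final class GoodnessOfFit {

    /**
     * Sum of squared errors between the observed points (x[i], y[i])
     * and the fitted curve f. Lower is better; 0 means the curve
     * passes through every point exactly.
     */
    public static double sumSquaredError(double[] x, double[] y, UnivariateFunction f) {
        double sse = 0.0;
        for (int i = 0; i < x.length; i++) {
            double residual = y[i] - f.value(x[i]);
            sse += residual * residual;
        }
        return sse;
    }
}
```

One caveat: raw SSE is scale-dependent (doubling every y value quadruples it), which is presumably why the eventual fix below reports R-squared instead.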
Changes will be made on 22/2/2014.
Small note: Apache Commons Math's spline interpolations, by virtue of treating the original data set's values as knot points, should not require R-values or other verifying information: an interpolating spline passes through every data point by construction, so any fit statistic computed against the same data will trivially indicate a perfect fit.
Obviously, this can still produce very awkward curves between the knots, but that's where outlier detection might help.
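A quick demonstration of why spline fits need no R-value, using the real Commons Math 3 `SplineInterpolator` API (the demo class itself is hypothetical): the returned `PolynomialSplineFunction` reproduces every knot exactly.

```java
import org.apache.commons.math3.analysis.interpolation.SplineInterpolator;
import org.apache.commons.math3.analysis.polynomials.PolynomialSplineFunction;

public class SplineKnotDemo {
    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3, 4};
        double[] y = {0, 1, 4, 9, 16};

        PolynomialSplineFunction spline = new SplineInterpolator().interpolate(x, y);

        // Every residual at a knot is (numerically) zero, so SSE = 0
        // and R^2 = 1 by construction -- not a meaningful validity check.
        for (int i = 0; i < x.length; i++) {
            System.out.printf("x=%.1f  y=%.1f  spline(x)=%.6f%n", x[i], y[i], spline.value(x[i]));
        }
    }
}
```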
Fixed it on 3/10/2014:
1. Linear regressions use the old Pearson's R format, provided via Apache Commons Math and the DataConfidence class.
2. Polynomial (quadratic, cubic, quartic) functions use the new showRSquaredValidity(DataSet, PolynomialFunction2D) function, currently located in the Interpolator class. This method calculates the R-squared value, which isn't automatically verifiable, but it's good enough.
3. Atypical ways of obtaining data interpolations use the showRSquaredValidity(DataSet, PolynomialSplineFunction) function, also currently located in the Interpolator class. This method calculates the R-squared value as well. (See the previous point for the drawback of R-squared, and the sketch below for the general shape of the calculation.)
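For reference, here is a minimal sketch of the general shape of that R-squared calculation (the actual showRSquaredValidity implementation in the Interpolator class may differ; `UnivariateFunction` is used here so one method covers both the polynomial and the spline cases):

```java
import org.apache.commons.math3.analysis.UnivariateFunction;

public final class RSquaredSketch {

    /**
     * Coefficient of determination: R^2 = 1 - SS_res / SS_tot.
     * 1 means the curve explains all of the variance in y; the value
     * can go negative when the fit is worse than a flat mean line.
     */
    public static double rSquared(double[] x, double[] y, UnivariateFunction f) {
        double mean = 0.0;
        for (double v : y) {
            mean += v;
        }
        mean /= y.length;

        double ssRes = 0.0; // residual sum of squares, against the fitted curve
        double ssTot = 0.0; // total sum of squares, against the mean of y
        for (int i = 0; i < x.length; i++) {
            double residual = y[i] - f.value(x[i]);
            ssRes += residual * residual;
            double deviation = y[i] - mean;
            ssTot += deviation * deviation;
        }
        return 1.0 - ssRes / ssTot;
    }
}
```

As noted above, R-squared has no universal pass/fail threshold, but it does put all of the non-linear InterpolationTypes on a common scale.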
As such, this issue is now fixed.
As it says on the tin: Apparently, the Spline Interpolation test provides worthless R-values.