Spline Interpolation + R-Value CI = Absolute useless results.

plasmavisgroup / PlasmaGraph

System for graphing PUPR plasma data through a GUI. Basically a tabulated grapher, with a few nuances.

Spline Interpolation + R-Value CI = Absolute useless results. #5

Jellyfish-Red closed this issue 10 years ago

Jellyfish-Red commented 10 years ago

As it says on the tin: Apparently, the Spline Interpolation test provides worthless R-values.

Jellyfish-Red commented 10 years ago

Found the reason why. R-values, or, more specifically, the Pearson product-moment correlation coefficient, measure linear correlation. In other words, when the data is highly curved, the R-values provided are useless. In such cases, we need an alternate method to assess interpolation validity.
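To illustrate the point, here is a minimal sketch in plain Java (no Apache Commons Math, so it runs on its own). The class and method names `PearsonDemo` and `pearson` are made up for this example and are not part of PlasmaGraph. It computes Pearson's r for points lying exactly on a parabola, where y is fully determined by x and yet the linear correlation vanishes.

```java
// Illustrative sketch only: PearsonDemo and pearson() are hypothetical names,
// not PlasmaGraph or Apache Commons Math code.
class PearsonDemo {

    /** Pearson product-moment correlation coefficient of two equal-length samples. */
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0; // sum of products of deviations
        double sxx = 0.0; // sum of squared x deviations
        double syy = 0.0; // sum of squared y deviations
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        // Points on y = x^2, symmetric around x = 0.
        double[] x = {-3, -2, -1, 0, 1, 2, 3};
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = x[i] * x[i];
        }
        // The relationship is exact, but r comes out at (or extremely near) zero.
        System.out.println("r for y = x^2: " + pearson(x, y));
    }
}
```

The symmetric parabola is the extreme case; for less symmetric curved data r will not be zero, but it still only reports how well a straight line describes the points.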

Jellyfish-Red commented 10 years ago

(Kind of) Confirmed: There are no really good general algorithms to calculate goodness of fit of any curve and provide a benchmark for comparison (nor for Confidence Intervals).

Changes will be as follows:
1) Pearson's R values will be provided for InterpolationType.LINEAR interpolations.
2) Goodness-of-fit Sum of Squares Error values will be provided for interpolations of any other InterpolationType.
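For reference, a Sum of Squares Error check along these lines could look like the sketch below. This is an assumption about the general shape, not the code that went into PlasmaGraph: `SseDemo` and `sumOfSquaredErrors` are hypothetical names, and the fitted curve is passed in as a plain `DoubleUnaryOperator` rather than any project or Commons Math type.

```java
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch only: SseDemo and sumOfSquaredErrors() are hypothetical names.
class SseDemo {

    /** Sum over all points of (observed y - fitted y)^2. Lower is a closer fit. */
    static double sumOfSquaredErrors(double[] x, double[] y, DoubleUnaryOperator fit) {
        double sse = 0.0;
        for (int i = 0; i < x.length; i++) {
            double residual = y[i] - fit.applyAsDouble(x[i]);
            sse += residual * residual;
        }
        return sse;
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3};
        double[] y = {0.1, 0.9, 4.2, 8.8};
        // Candidate curve: y = x^2 (chosen by hand for this example).
        double sse = sumOfSquaredErrors(x, y, v -> v * v);
        System.out.println("SSE against y = x^2: " + sse);
    }
}
```

Note that SSE is in squared units of y, so it is only comparable between fits of the same data set.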

Changes will be made on 22/2/2014.

Jellyfish-Red commented 10 years ago

Small note: Apache Commons Math's Spline Interpolations, by virtue of marking the original data set's values as knot points, should not require R-values or other verifying information, as they will automatically try to hit all the values.
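A quick way to see why fit statistics are uninformative for an interpolant that passes through every data point: the residuals at the knots are zero by construction. The sketch below uses a hand-written piecewise-linear interpolant as a stand-in for the Commons Math spline (a simplification chosen to keep the example self-contained; the names `KnotResidualDemo`, `interpolate` and `maxAbsResidualAtKnots` are hypothetical).

```java
// Illustrative sketch only. A piecewise-linear interpolant stands in for the
// cubic spline; both pass through every knot, which is the property shown here.
class KnotResidualDemo {

    /** Piecewise-linear interpolant through (x[i], y[i]); x must be strictly increasing. */
    static double interpolate(double[] x, double[] y, double t) {
        for (int i = 0; i < x.length - 1; i++) {
            if (t >= x[i] && t <= x[i + 1]) {
                double w = (t - x[i]) / (x[i + 1] - x[i]);
                return y[i] + w * (y[i + 1] - y[i]);
            }
        }
        throw new IllegalArgumentException("t is outside the knot range");
    }

    /** Largest absolute difference between the data and the interpolant at the knots. */
    static double maxAbsResidualAtKnots(double[] x, double[] y) {
        double max = 0.0;
        for (int i = 0; i < x.length; i++) {
            max = Math.max(max, Math.abs(y[i] - interpolate(x, y, x[i])));
        }
        return max;
    }

    public static void main(String[] args) {
        // Deliberately jagged data: the interpolant still hits every point.
        double[] x = {0, 1, 2, 3, 4};
        double[] y = {1.0, 4.5, 2.0, 7.25, 3.0};
        System.out.println("max residual at knots: " + maxAbsResidualAtKnots(x, y));
    }
}
```

Any statistic computed from those residuals (SSE, R-squared) therefore says nothing about how sensible the curve is between the knots.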

Obviously, this might result in very awkward curves, but that's where finding outliers might work.

Jellyfish-Red commented 10 years ago

Fixed it on 3/10/2014:

1) Linear regressions use the old Pearson's R format provided via Apache Commons Math and the DataConfidence class.
2) Polynomial (quadratic, cubic, quartic) functions use the NEW "showRSquaredValidity (DataSet, PolynomialFunction2D)" function, currently located in the Interpolator class. This function calculates the R-squared value, which isn't automatically verifiable, but it's good enough.
3) Atypical interpolation methods (such as splines) use the "showRSquaredValidity (DataSet, PolynomialSplineFunction)" function, also currently located in the Interpolator class. This function calculates the R-squared value as well. (See the previous point for the drawback of R-squared.)
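For readers who want the formula behind the R-squared figure, here is a minimal sketch of the usual definition, R² = 1 − SS_res / SS_tot. It is not the project's showRSquaredValidity code: `RSquaredDemo` and `rSquared` are hypothetical names, and the fitted curve is taken as a `DoubleUnaryOperator` so the example runs without Apache Commons Math.

```java
import java.util.function.DoubleUnaryOperator;

// Illustrative sketch only: RSquaredDemo and rSquared() are hypothetical names,
// not the Interpolator.showRSquaredValidity implementation.
class RSquaredDemo {

    /**
     * Coefficient of determination: 1 - SS_res / SS_tot.
     * Returns NaN when all y values are identical (SS_tot is zero).
     */
    static double rSquared(double[] x, double[] y, DoubleUnaryOperator fit) {
        double meanY = 0.0;
        for (double v : y) {
            meanY += v;
        }
        meanY /= y.length;

        double ssRes = 0.0; // squared distance from the data to the fitted curve
        double ssTot = 0.0; // squared distance from the data to its own mean
        for (int i = 0; i < x.length; i++) {
            double residual = y[i] - fit.applyAsDouble(x[i]);
            double deviation = y[i] - meanY;
            ssRes += residual * residual;
            ssTot += deviation * deviation;
        }
        return 1.0 - ssRes / ssTot;
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3, 4};
        double[] y = {0.2, 0.9, 4.1, 9.3, 15.8};
        // Candidate curve: y = x^2 (chosen by hand for this example).
        System.out.println("R^2 against y = x^2: " + rSquared(x, y, v -> v * v));
    }
}
```

With this definition, a curve that passes through every data point gets R² = 1 regardless of its shape between the points, which is one reason to read the value with caution for spline interpolations.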

As such, this issue is now fixed.