Model Performance Statistics Analysis

Because the data deviate significantly from normality, model performance is analyzed using rank-based analysis of linear models (see 2.4. Statistics).

Rank based analysis of linear models

Rank-based analysis of linear models is a method for comparing the performance of multiple models based on their rankings, rather than their absolute scores or metrics. This type of analysis can be particularly useful when the scale or unit of the performance metric might vary across models or when the absolute performance metric might be difficult to interpret.

Steps:

Ranking Models:

For each model, calculate a performance metric (like Mean Squared Error for regression or Accuracy for classification). Then, rank the models based on this metric. The best-performing model gets a rank of 1, the next best gets a rank of 2, and so on.
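The ranking step can be sketched in a few lines. This is a minimal illustration, not the repository's own code: the model names, the metric values, and the helper `rank_models` are hypothetical, and ties are assumed to share the average rank.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical metric values for three illustrative models.
mse = {"model_a": 0.12, "model_b": 0.30, "model_c": 0.21}       # lower is better
accuracy = {"model_a": 0.91, "model_b": 0.84, "model_c": 0.88}  # higher is better


def rank_models(scores, higher_is_better=False):
    """Return {model: rank}, where rank 1 is the best model.

    Tied models share the average of the ranks they span.
    """
    names = list(scores)
    values = np.array([scores[name] for name in names], dtype=float)
    if higher_is_better:
        values = -values  # flip the sign so the largest score gets rank 1
    ranks = rankdata(values, method="average")
    return dict(zip(names, ranks.tolist()))


mse_ranks = rank_models(mse)
acc_ranks = rank_models(accuracy, higher_is_better=True)
print(mse_ranks)
print(acc_ranks)
```

Whether a metric is ranked ascending or descending has to be decided per metric, which is why the sketch takes an explicit `higher_is_better` flag.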

Comparing Ranks:

Instead of directly comparing the performance metrics of the models, compare their ranks. For instance, if three models A, B, and C have ranks 1, 2, and 3 respectively, you would prefer model A over B and C.
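When models are evaluated on several runs (for example several subjects or folds), one common way to compare ranks is to rank the models within each run and then average. The sketch below assumes that setup; the accuracy values and model names are made up for illustration.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical accuracies: rows are evaluation runs, columns are models A, B, C.
model_names = ["A", "B", "C"]
accuracy = np.array([
    [0.81, 0.78, 0.70],
    [0.75, 0.77, 0.69],
    [0.90, 0.85, 0.80],
    [0.66, 0.64, 0.65],
])

# Rank within each run; negate so the highest accuracy receives rank 1.
ranks = rankdata(-accuracy, axis=1)

# Average the ranks over runs and order the models from best to worst.
mean_ranks = ranks.mean(axis=0)
order = [model_names[i] for i in np.argsort(mean_ranks)]

print(dict(zip(model_names, mean_ranks.tolist())))
print(order)
```

A lower mean rank indicates a model that is preferred more consistently across runs, regardless of how large the raw differences in accuracy are.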

Least squares (RAOV) (Kloke and McKean, 2012)

RAOV stands for "Rank Analysis of Variance," and it represents a non-parametric alternative to the traditional Analysis of Variance (ANOVA) method. RAOV is used when the assumptions of traditional ANOVA (such as normality) are not met. Instead of using the raw data, RAOV uses the ranks of the data.

The idea behind RAOV is similar to the rationale behind the Wilcoxon rank-sum test or the Kruskal-Wallis test. These are rank-based methods that provide non-parametric alternatives to the t-test and one-way ANOVA, respectively.
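For reference, both of the rank-based tests mentioned above are available in SciPy. The following sketch runs them on hypothetical per-run accuracies for three models; the data are invented and chosen only so that the groups are clearly separated.

```python
from scipy.stats import kruskal, mannwhitneyu

# Hypothetical accuracies of three models over five runs each.
model_a = [0.70, 0.72, 0.71, 0.69, 0.73]
model_b = [0.75, 0.77, 0.76, 0.74, 0.78]
model_c = [0.80, 0.82, 0.81, 0.79, 0.83]

# Kruskal-Wallis: rank-based alternative to one-way ANOVA (three or more groups).
h_stat, h_p = kruskal(model_a, model_b, model_c)

# Wilcoxon rank-sum / Mann-Whitney U: rank-based alternative to the two-sample t-test.
u_stat, u_p = mannwhitneyu(model_a, model_b, alternative="two-sided")

print(f"Kruskal-Wallis H = {h_stat:.2f}, p = {h_p:.4f}")
print(f"Mann-Whitney U = {u_stat:.1f}, p = {u_p:.4f}")
```

A small p-value from the Kruskal-Wallis test only says that at least one group differs; pairwise tests such as the one above are needed to locate the difference.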

"Rank Analysis of Variance Using Least Squares" (abbreviated here as RAOV-LS) combines the rank transformation approach of RAOV with the least squares estimation method commonly used in linear models.

Here's a basic outline of how it works:

Rank Transformation: Convert the raw data into ranks. For example, the smallest value gets a rank of 1, the next smallest gets a rank of 2, and so on.

Least Squares Estimation: Fit a linear model to the ranked data using least squares estimation, just as you would with traditional ANOVA.

Hypothesis Testing: Test the significance of factors in the model using standard methods, such as an F-test. However, since you're working with ranks, effects are interpreted as shifts in rank (location) between groups, often loosely read as median differences, rather than as mean differences.
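The three steps above can be sketched for a one-way layout as follows. This follows the outline literally (rank transform, least squares fit, F-test) and is not claimed to reproduce any particular package or the analysis used in this repository; the function name `rank_transform_anova` and the data are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata, f as f_dist


def rank_transform_anova(groups):
    """One-way ANOVA on rank-transformed data, fitted by least squares."""
    y = np.concatenate([np.asarray(g, dtype=float) for g in groups])
    labels = np.concatenate([np.full(len(g), i) for i, g in enumerate(groups)])
    k, n = len(groups), len(y)

    # Step 1: rank transformation over the pooled observations.
    r = rankdata(y)

    # Step 2: least squares fit with an intercept and dummy columns for groups 1..k-1.
    dummies = [(labels == j).astype(float) for j in range(1, k)]
    X = np.column_stack([np.ones(n)] + dummies)
    beta, *_ = np.linalg.lstsq(X, r, rcond=None)
    resid = r - X @ beta

    # Step 3: F-test of the group factor against the intercept-only model.
    ss_res = float(resid @ resid)
    ss_tot = float(((r - r.mean()) ** 2).sum())
    df_model, df_res = k - 1, n - k
    f_stat = ((ss_tot - ss_res) / df_model) / (ss_res / df_res)
    p_value = float(f_dist.sf(f_stat, df_model, df_res))
    return beta, f_stat, p_value


# Hypothetical accuracies of three models over five runs each.
groups = [
    [0.70, 0.72, 0.71, 0.69, 0.73],
    [0.75, 0.77, 0.76, 0.74, 0.78],
    [0.80, 0.82, 0.81, 0.79, 0.83],
]
beta, f_stat, p_value = rank_transform_anova(groups)
print(beta, f_stat, p_value)
```

Here `beta[0]` is the mean rank of the first group and the remaining coefficients are differences in mean rank relative to it, so the effects are read on the rank scale rather than in the metric's original units.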

Wald test

The Wald test is a widely used statistical test in regression analysis and econometrics for testing the significance of individual coefficients in a model. It is used to determine whether a particular explanatory variable has a significant effect on the dependent variable, given the presence of the other variables in the model.
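As a rough illustration, a per-coefficient Wald test for an ordinary least squares fit can be computed as the squared ratio of each estimate to its standard error, compared against a chi-squared distribution with one degree of freedom. The sketch below assumes simulated data and a hypothetical helper `wald_test_ols`; it is not taken from this repository.

```python
import numpy as np
from scipy.stats import chi2


def wald_test_ols(X, y):
    """Per-coefficient Wald statistics for OLS; X must include an intercept column."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - p)          # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))   # standard errors of the coefficients
    wald = (beta / se) ** 2                   # H0: coefficient equals zero
    p_values = chi2.sf(wald, df=1)
    return beta, se, wald, p_values


# Simulated data: y depends on x1 (true slope 2.0) but not on x2 (true slope 0.0).
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.0 * x2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x1, x2])
beta, se, wald, p_values = wald_test_ols(X, y)
for name, b, w, pv in zip(["intercept", "x1", "x2"], beta, wald, p_values):
    print(f"{name}: estimate = {b:.3f}, Wald = {w:.2f}, p = {pv:.4f}")
```

A large Wald statistic (small p-value) for a coefficient suggests that the corresponding variable contributes to the model after accounting for the others.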

mralioo / BBCPy_DNN