Closed stanfordstrickland closed 2 years ago
In your reference formula, k is the number of predictors excluding bias term. In our implementation, it includes the bias term. So they are the same indeed.
Perhaps I'm missing something. In LinearModel
this.adjustedRSquared = 1.0D - (1.0D - this.RSquared) * (double)(n - 1) / (double)(n - this.p); this.p = X.ncols();
I was under the impression that the X matrix is not padded with 1s to include the intercept. Rather the intercept is passed as the parameter b. Here is the constructor public LinearModel(Formula formula, StructType schema, Matrix X, double[] y, double[] w, double b). Therefore p in this instance will not include the intercept as it will only be the count of the X columns.
Might be worth pointing out that this is in the context of using RidgeRegression.
Describe the bug Adjusted R Squared formula seems incorrect.
Expected behavior he adjusted r squared is defined as adj_r_2 = 1 - (1-r2)*(n-1)/(n-k-1) where k are the number of predictors (excluding the intercept).
Actual behavior he adjusted r squared is defined as adj_r_2 = 1 - (1-r2)*(n-1)/(n-k) where k are the number of predictors (excluding the intercept).
Code snippet In the LinearModel class
this.adjustedRSquared = 1.0D - (1.0D - this.RSquared) * (double)(n - 1) / (double)(n - this.p);
where
_this.p = X.ncols();
Additional context Perhaps its also worth noting that if an intercept is not included the formula for adjusted R squared is
adj_r_2 = 1−n*(1−r2)/(n−k)
where k are the number of predictors.