Open exalate-issue-sync[bot] opened 1 year ago
Wendy commented: Here is the deal:
When standardize=true, the model fits the coefficients _beta on standardized numerical columns. _global_beta then holds coefficients derived from _beta for use with non-standardized numerical columns. Hence, _global_beta can be used for scoring without having to standardize the numerical predictors first.
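The comment does not spell out how the non-standardized coefficients are derived. A minimal sketch of one way to do it, assuming each numerical column is standardized as z = (x - mean) / sigma; the function name `destandardize` and all numbers are hypothetical and are not H2O's API:

```python
import numpy as np

def destandardize(beta_std, intercept_std, means, sigmas):
    """Map coefficients fit on z = (x - mean) / sigma to coefficients for raw x."""
    beta_std = np.asarray(beta_std, dtype=float)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    beta_raw = beta_std / sigmas                      # slope per raw unit of x
    intercept_raw = intercept_std - np.sum(beta_raw * means)
    return beta_raw, intercept_raw

# Hypothetical column statistics and standardized coefficients.
means = np.array([10.0, -2.0])
sigmas = np.array([2.0, 0.5])
beta_std = np.array([1.5, -0.25])
intercept_std = 0.7

beta_raw, intercept_raw = destandardize(beta_std, intercept_std, means, sigmas)

# Both parameterizations should give the same linear predictor for any row.
x = np.array([12.0, -1.0])
z = (x - means) / sigmas
eta_std = intercept_std + beta_std @ z
eta_raw = intercept_raw + beta_raw @ x
assert np.isclose(eta_std, eta_raw)
```

The final assertion is the point of the sketch: scoring raw predictors with the derived coefficients matches scoring standardized predictors with the fitted ones.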
When standardize=false, the model fits the coefficients _beta on non-standardized numerical columns. In this case, _global_beta is the same as _beta. However, if you want to see the values of the coefficients as they would apply to standardized columns, you can call standardizedCoefficients, implemented by our own Zuzana Olajcová. When a user asks for the standardized coefficients, the following transformation is applied:
newbeta(1) = _beta(1) * sigma1, newbeta(2) = _beta(2) * sigma2, ...
newbeta(0) = _beta(0) + _beta(1) * mean1 + _beta(2) * mean2 + ...
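The transformation above can be sketched in a few lines. This is an illustration under the assumption that standardization means z = (x - mean) / sigma; `standardize_coefficients` is a hypothetical helper, not the actual standardizedCoefficients implementation, and the numbers are made up:

```python
import numpy as np

def standardize_coefficients(beta, intercept, means, sigmas):
    """Apply newbeta(i) = beta(i) * sigma_i and newbeta(0) = beta(0) + sum(beta(i) * mean_i)."""
    beta = np.asarray(beta, dtype=float)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    new_beta = beta * sigmas
    new_intercept = intercept + np.sum(beta * means)
    return new_beta, new_intercept

# Hypothetical coefficients fit on non-standardized columns.
beta = np.array([2.0, -4.0])
intercept = 1.0
means = np.array([3.0, 0.5])
sigmas = np.array([1.5, 0.25])

new_beta, new_intercept = standardize_coefficients(beta, intercept, means, sigmas)

# The linear predictor is unchanged when the row is standardized to match.
x = np.array([4.5, 1.0])
z = (x - means) / sigmas
assert np.isclose(intercept + beta @ x, new_intercept + new_beta @ z)
```

The intercept absorbs the mean shift and each slope is rescaled by its column's sigma, so predictions stay identical while the coefficients become comparable across columns.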
I need to port this into our documentation to make it clear.
JIRA Issue Migration Info
Jira Issue: PUBDEV-7245 Assignee: Wendy Reporter: Zuzana Olajcová State: Open Fix Version: N/A Attachments: N/A Development PRs: N/A
When building a GLM model, we can set the parameter standardize to true or false. However, it is not clear whether the reported coefficients are the standardized ones or not. It is also not clear whether the documentation of those coefficients is correct.
This confusion popped up during a discussion between Wendy and Zuzana.