Open ASevastkms opened 3 years ago
1) Yes, that's correct: regressor_coefficients() un-standardizes the coefficients.
2) Just checking that what you've pasted isn't the value of the extra regressors for the prediction period? (i.e. you'd have columns A, B, C, D with their actual values as well as inputs to the prediction).
If not, then yes, they should be the contributions of those regressors to yhat. The contribution can be negative if the value of A is below the "center": the contribution to the prediction is coefficient_standardized * (A_value - center) / std
Hi @tcuongd, I have been trying to recreate the contribution of each variable using the formula you stated above, but I have been unable to do so. I am not sure whether I am missing a step, and I would very much appreciate your help before my deadline to explain this to the stakeholders.
I would like to give a single example, for simplicity, so we can follow the steps.
Let's say the variable is X (I have quite a lot of them), and the output of regressor_coefficients() is as follows:
regressor | regressor_mode | center | coef_lower | coef | coef_upper
--- | --- | --- | --- | --- | ---
X | additive | 409262.904434 | 0.000059 | 0.000059 | 0.000059
The coefficient above is the un-standardized value of that variable's coefficient, which we do not use to calculate its contribution, is that right?
Then, using the method mentioned above, m.train_component_cols.T.dot(np.array(m.params['beta'])[0]), I get X = 0.003110, which is the standardized coefficient for the X variable. Is that correct?
Then I look at the df of prediction outputs. On the first day (t) I see the value of X as -24.332580 and the next day (t+1) 124.304026.
In the actual data frame, the raw value of X is 0 on the first day (t) and 2500000.0 on the second day (t+1).
And from the extra regressors dictionary out of the model, I extract the following information:
('X', {'prior_scale': 5.0, 'standardize': 'auto', 'mu': 409262.9044342508, 'std': 840476.036516692, 'mode': 'additive'})
So the contribution of X on the first day (t) in the model should be:
0.003110 * ((0 - 409262.9044342508)/840476.036516692)
and for the next day (t+1): 0.003110 * ((2500000.0 - 409262.9044342508)/840476.036516692)
but these calculations do not give me anything near -24.332580 or 124.304026. Could you please point out the steps I am missing in calculating the contribution of each regressor to yhat on a given day?
I am hoping you will see my question soon.
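The numbers quoted in this comment can be checked against each other. Below is a minimal sketch, assuming that for an additive regressor the forecast column equals coef * (raw_value - center), where coef is the un-standardized value from regressor_coefficients(). It also assumes that coef relates to the standardized beta through coef = beta * y_scale / std, which is my reading of Prophet's source and worth verifying against your installed version. Because coef is printed rounded to six decimals, the sketch recovers the slope from the two reported contributions instead of using the rounded value. The y_scale is not reported anywhere in this thread, so the value computed here is only what the quoted figures imply.

```python
# Figures quoted in the comment above for regressor X (additive mode).
center = 409262.9044342508      # 'mu' in m.extra_regressors, 'center' in the table
std = 840476.036516692          # 'std' in m.extra_regressors
beta_standardized = 0.003110    # from train_component_cols.T.dot(beta)
coef_printed = 0.000059         # 'coef' column, rounded to 6 decimals

x_t, x_t1 = 0.0, 2500000.0                        # raw regressor values
contrib_t, contrib_t1 = -24.332580, 124.304026    # X column of the forecast

# Slope implied by the two reported contributions (change in yhat per unit of X).
implied_coef = (contrib_t1 - contrib_t) / (x_t1 - x_t)

# If contribution = coef * (x - center), then at x = 0 it equals -coef * center.
implied_center = -contrib_t / implied_coef        # valid only because x_t == 0

# ASSUMPTION: coef = beta_standardized * y_scale / std for additive regressors.
implied_y_scale = implied_coef * std / beta_standardized


def contribution(x, coef=implied_coef, mu=center):
    """Additive contribution of raw value x under the assumption stated above."""
    return coef * (x - mu)


print(f"implied coef    = {implied_coef:.9f} (printed coef: {coef_printed})")
print(f"implied center  = {implied_center:.1f} (reported: {center:.1f})")
print(f"implied y_scale = {implied_y_scale:.1f}")
print(f"contribution(t)   = {contribution(x_t):.4f} (reported: {contrib_t})")
print(f"contribution(t+1) = {contribution(x_t1):.4f} (reported: {contrib_t1})")
```

If the implied slope rounds to the printed coef and the implied center matches the reported one, the gap in the hand calculation is most likely the scaling of y: multiplying the standardized coefficient by the standardized regressor gives a contribution on the scaled-y axis, not in the original units of yhat.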
Hi @elif-tr, did you solve this? I am having the same issue with this part and am trying to find a solution.
Good day!
I have built a model with Prophet and the prediction results are very good. My model has 4 additional regressors (for simplicity: A, B, C, D), and I need to get their coefficients and find out how much the forecast depends on each of them (the contributions of the regressors). I am using the regressor_coefficients() function and I get:
regressor | regressor_mode | center | coef_lower | coef | coef_upper
--- | --- | --- | --- | --- | ---
A | additive | 1581.334616 | 0.000153 | 0.000153 | 0.000153
B | additive | 655.913061 | -0.003166 | -0.003166 | -0.003166
C | additive | 0.000000 | 0.000000 | 0.000000 | 0.000000
D | additive | 0.000000 | 0.000000 | 0.000000 | 0.000000
And when using m.train_component_cols.T.dot(np.array(m.params['beta'])[0]) I get the following:
component
A    0.000248
B   -0.002786
C    0.000000
D    0.000000
Do I understand correctly that in the second case the coefficients are standardized, but in the first case they are no longer?
When I do forecast = m.predict(future) I get:
A | A_lower | A_upper | B | B_lower | B_upper
--- | --- | --- | --- | --- | ---
-0.24199 | -0.24199 | -0.24199 | 2.076534 | 2.076534 | 2.076534
What are these values? Are they the contributions of the regressors to the result? If so, I do not understand why regressor A, which has a positive coefficient, gives a negative contribution. Could you explain, please? Sorry for my bad English.
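One way to see how a positive coefficient can produce a negative contribution is to work backwards from the numbers in this post. This is a minimal sketch, assuming that for an additive regressor the forecast column equals coef * (raw_value - center), with coef and center taken from regressor_coefficients(). The raw values of A and B for this forecast row are not given in the post, so the sketch back-solves what they would have to be under that assumption. `implied_value` is a hypothetical helper, not part of Prophet, and since coef is printed to six decimals the results are only approximate.

```python
# Figures quoted in the post for the additive regressors A and B.
rows = {
    "A": {"center": 1581.334616, "coef": 0.000153, "contribution": -0.24199},
    "B": {"center": 655.913061, "coef": -0.003166, "contribution": 2.076534},
}


def implied_value(center, coef, contribution):
    """Raw regressor value implied by contribution = coef * (value - center)."""
    return center + contribution / coef


implied = {}
for name, r in rows.items():
    x = implied_value(**r)
    implied[name] = x
    side = "below" if x < r["center"] else "at or above"
    print(f"{name}: implied raw value ~ {x:.3f}, {side} center {r['center']:.3f}")
```

Under this assumption the sign of a contribution is the sign of the coefficient times the sign of (value - center). A value below the center therefore gives a negative contribution when the coefficient is positive (A) and a positive contribution when the coefficient is negative (B).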