Open lorentzenchr opened 7 months ago
Hi @lorentzenchr, I would like to contribute to this issue.
- Mean absolute percentage error (MAPE) is used quite a lot. I propose to replace it, in particular if predicting/forecasting the mean value. Note that MAPE is optimized by the median of a distribution with pdf proportional to $\frac{f(y)}{y}$, where $f(y)$ is the pdf of the true distribution of the data.
With respect to this, which alternative evaluation metric would you recommend in place of MAPE? Also, are you suggesting that we replace MAPE with another metric in the example's `scoring` dictionary?
- The `pinball_loss_50` is the same as 1/2 MAE, so this redundancy could be removed.
- A residual vs. predicted plot does not really make sense for 5%- and 95%-quantile predictions. A reliability diagram for quantiles might be a good replacement, see model-diagnostics `plot_reliability_diagram`. Note that this is not possible within current scikit-learn. Maybe the best next action is to add a little more explanation to the graphs.
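To make the redundancy point concrete, here is a minimal sketch in plain Python, not the example's actual code. The helpers `mae_score` and `pinball_score` are hypothetical stand-ins written for illustration, and the data is made up; in scikit-learn the corresponding functions would be `mean_absolute_error` and `mean_pinball_loss` with `alpha=0.5`.

```python
def mae_score(y_true, y_pred):
    """Mean absolute error (hypothetical helper for illustration)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pinball_score(y_true, y_pred, alpha):
    """Mean pinball loss at quantile level alpha (hypothetical helper)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        # Under-prediction is weighted by alpha, over-prediction by (1 - alpha).
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)


# Made-up data, only to illustrate the identity.
y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

mae = mae_score(y_true, y_pred)
pinball_50 = pinball_score(y_true, y_pred, alpha=0.5)

# At alpha = 0.5 both branches weight the error by 1/2, so the loss is MAE / 2.
print(mae, pinball_50)
```

Since the two scores differ only by a constant factor, they rank models identically, which is why reporting both adds no information.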
With regard to these suggestions, I will work on incorporating the required changes into the example.
A few improvements could be made on the new example of #25350:
- Mean absolute percentage error (MAPE) is used quite a lot. I propose to replace it, in particular if predicting/forecasting the mean value. Note that MAPE is optimized by the median of a distribution with pdf proportional to $\frac{f(y)}{y}$, where $f(y)$ is the pdf of the true distribution of the data.
- The `pinball_loss_50` is the same as 1/2 MAE, so this redundancy could be removed.
- A residual vs. predicted plot does not really make sense for 5%- and 95%-quantile predictions. A reliability diagram for quantiles might be a good replacement, see model-diagnostics `plot_reliability_diagram`. Note that this is not possible within current scikit-learn. Maybe the best next action is to add a little more explanation to the graphs.
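The MAPE point can be illustrated with a small sketch, assuming a made-up positive sample and a constant prediction. For an empirical sample, the analogue of "the median of a distribution with pdf proportional to $\frac{f(y)}{y}$" is a weighted median with weights $1/y_i$. The helpers `mape` and `weighted_median` are hypothetical names written for this illustration, not scikit-learn functions.

```python
def mape(y_true, c):
    """MAPE of the constant prediction c (hypothetical helper)."""
    return sum(abs(t - c) / t for t in y_true) / len(y_true)


def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    half = sum(weights) / 2
    cumulative = 0.0
    for value, weight in sorted(zip(values, weights)):
        cumulative += weight
        if cumulative >= half:
            return value


# Made-up, right-skewed positive sample.
y = [1.0, 2.0, 4.0, 8.0, 16.0]

# Weighting each observation by 1/y mirrors the reweighted pdf f(y)/y.
best = weighted_median(y, [1 / t for t in y])
mean_y = sum(y) / len(y)

print(best, mean_y)
print(mape(y, best), mape(y, mean_y))
```

On this sample the MAPE-optimal constant sits at the smallest observation, far below the mean, so a model selected by MAPE is pulled away from the mean value that the example is trying to forecast.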