Open fkiraly opened 6 months ago
> What are your thoughts on this @fkiraly?
Excellent suggestion, in my opinion!
Yes, the normal assumption has bothered me for a while, but there were no good alternatives until the various empirical distributions were implemented.
One open question, of course: if we used an empirical QPD, we would need to choose some arbitrary set of quantile points. Perhaps all the percentiles?
A further problem could be lack of smoothness, which risks suddenly breaking user workflows that involve losses assuming continuous distributions. This might be a major issue, and we should finish discussing it before doing something too quickly.
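To make the smoothness concern concrete, here is a minimal NumPy-only sketch. It assumes a percentile grid of 1% to 99% and uses a simulated sample as a stand-in for a regressor's quantile predictions; `empirical_cdf` is a hypothetical helper, not part of skpro.

```python
import numpy as np

# Assumed quantile grid: all percentiles from 1% to 99%
alphas = np.arange(1, 100) / 100

# Stand-in for quantile predictions at a single test point:
# quantiles of a simulated skewed sample (purely illustrative)
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=0.5, size=10_000)
q_pred = np.quantile(sample, alphas)


def empirical_cdf(x, support):
    """Step-function CDF with equal mass on each support point."""
    support = np.sort(support)
    return np.searchsorted(support, x, side="right") / len(support)


# The CDF jumps by 1/99 at each support point ...
eps = 1e-9
jump = empirical_cdf(q_pred[49], q_pred) - empirical_cdf(q_pred[49] - eps, q_pred)

# ... and is flat between two adjacent support points
left = empirical_cdf(q_pred[49], q_pred)
mid = empirical_cdf((q_pred[49] + q_pred[50]) / 2, q_pred)
```

Because the CDF is a step function, it has no density in the usual sense, so a loss that evaluates a pdf (for example a log-loss) has nothing well defined to work with.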
Some options I can think of, besides making this the default overall:
- `_predict_proba`, e.g., `EmpiricalPredictProba(my_regr)`
- `set_config` that allows setting the type of default being used
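The wrapper option could look roughly like the sketch below. Everything in it is hypothetical: `EmpiricalPredictProba` is only the name floated above, `ToyQuantileRegressor` is a stand-in that is not an skpro class, and the return value is a plain array of support points rather than a distribution object.

```python
import numpy as np


class ToyQuantileRegressor:
    """Stand-in regressor with quantile predictions (not an skpro class)."""

    def fit(self, X, y):
        self._y = np.asarray(y, dtype=float)
        return self

    def predict_quantiles(self, X, alpha):
        # Ignores X: same unconditional quantiles for every row
        q = np.quantile(self._y, alpha)
        return np.tile(q, (len(X), 1))


class EmpiricalPredictProba:
    """Hypothetical wrapper: quantile predictions become empirical support."""

    def __init__(self, regressor, alpha=None):
        self.regressor = regressor
        # Assumed default grid: all percentiles from 1% to 99%
        self.alpha = np.arange(1, 100) / 100 if alpha is None else np.asarray(alpha)

    def fit(self, X, y):
        self.regressor.fit(X, y)
        return self

    def predict_proba(self, X):
        # One row of equally weighted support points per test instance
        return self.regressor.predict_quantiles(X, self.alpha)


rng = np.random.default_rng(0)
X, y = np.zeros((20, 2)), rng.normal(size=20)
X_test = np.zeros((3, 2))

wrapped = EmpiricalPredictProba(ToyQuantileRegressor()).fit(X, y)
support = wrapped.predict_proba(X_test)
```

With a wrapper like this, the empirical behaviour is opt-in per estimator, so existing workflows that rely on the normal default are left unchanged.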
Point raised by @Ram0nB in https://github.com/sktime/skpro/pull/236, about using a more meaningful default for `_predict_proba` if not available. Relevant for `skpro`, but the same logic also applies in `sktime`.

Original comment:
One thing that comes to mind is whether we also want to add the logic of converting a quantile prediction to a distribution estimate to the `BaseProbaRegressor`. Currently, the implementation of `BaseProbaRegressor`'s `_predict_proba` uses the var and mean prediction to return a normal distribution.

Maybe we can enhance `BaseProbaRegressor`'s `_predict_proba` such that it uses the `QPD_Empirical` if `_predict_quantiles`/`_predict_interval` are available, and else the current logic. This way, we don't assume a normal distribution if multiple quantiles are available. What are your thoughts on this @fkiraly?
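The proposed fallback can be sketched as below. This is a simplified illustration, not the actual `BaseProbaRegressor`: `BaseSketch` and its subclasses are hypothetical, "available" is assumed to mean "overridden by the subclass", and plain dicts stand in for distribution objects such as `QPD_Empirical` or a normal distribution.

```python
import numpy as np


class BaseSketch:
    """Hypothetical base class, not the actual BaseProbaRegressor."""

    def _predict(self, X):
        raise NotImplementedError

    def _predict_var(self, X):
        raise NotImplementedError

    def _predict_quantiles(self, X, alpha):
        raise NotImplementedError

    def _has_quantiles(self):
        # Assumption: "available" means the subclass overrides the method
        return type(self)._predict_quantiles is not BaseSketch._predict_quantiles

    def _predict_proba(self, X):
        if self._has_quantiles():
            # Assumed grid: all percentiles from 1% to 99%
            alpha = np.arange(1, 100) / 100
            support = self._predict_quantiles(X, alpha)
            return {"kind": "empirical", "support": support}
        # Current logic: normal distribution from mean and variance
        mu = self._predict(X)
        sigma = np.sqrt(self._predict_var(X))
        return {"kind": "normal", "mu": mu, "sigma": sigma}


class MeanVarRegressor(BaseSketch):
    def _predict(self, X):
        return np.zeros(len(X))

    def _predict_var(self, X):
        return np.ones(len(X))


class QuantileRegressor(BaseSketch):
    def _predict_quantiles(self, X, alpha):
        # Toy output: quantiles of Uniform(0, 1), identical for every row
        return np.tile(np.asarray(alpha), (len(X), 1))


X_test = np.zeros((4, 2))
normal_pred = MeanVarRegressor()._predict_proba(X_test)
empirical_pred = QuantileRegressor()._predict_proba(X_test)
```

The dispatch keeps the normal default for estimators that only provide mean and variance, and avoids the normality assumption only where quantile predictions exist.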