Closed fkiraly closed 7 months ago
Hey @fkiraly happy to see you here! Give me some time to take a look at this one as I'm not super familiar with that class of forecasters from sktime.
> The main difference is that the distribution does not need to be estimated via KDE, you already have it in a form where you can access pdf, cdf, etc, completely, and you have the quantile function too which helps with selecting x-axis range.
If I understood correctly, this is completely fine when it comes to interacting with ridgeplot, since we also accept x-y traces as input -- bypassing the KDE step altogether
> as I'm not super familiar with that class of forecasters from sktime.
Most popular forecasters have a probabilistic prediction mode, and they can be filtered by the tag `capability:pred_int`.
Check it out in the forecasting tutorial; the main tutorial of `skpro` also explains the tabular (non-time-series) interfaces around it.
> If I understood correctly, this is completely fine when it comes to interacting with ridgeplot, since we also accept x-y traces as input -- bypassing the KDE step altogether
Yes - in the conceptual space of `sktime` / `skpro`, the KDE step is an estimator, of type "distribution from sample". In the desired plot, we would go directly from distribution to plot, without the first step of going from sample to distribution.
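To make the "distribution straight to plot" idea concrete, here is a minimal sketch using only the Python standard library. It assumes a distribution object that exposes `pdf` and a quantile function; `statistics.NormalDist` is used purely as a stand-in, and the helper name `density_trace`, the row labels, and the parameter values are hypothetical. This is not the `skpro` or `ridgeplot` API, just an illustration of building x-y traces without sampling or KDE.

```python
from statistics import NormalDist


def density_trace(dist, n_points=200, lower_q=0.001, upper_q=0.999):
    """Return (xs, ys) for the pdf of `dist`, with no sampling and no KDE.

    `dist` is assumed to expose `pdf` and `inv_cdf` (the quantile function),
    as `statistics.NormalDist` does. The helper itself is hypothetical.
    """
    # The quantile function gives a sensible x-axis range directly.
    x_min = dist.inv_cdf(lower_q)
    x_max = dist.inv_cdf(upper_q)
    step = (x_max - x_min) / (n_points - 1)
    xs = [x_min + i * step for i in range(n_points)]
    # Evaluate the density exactly on the grid.
    ys = [dist.pdf(x) for x in xs]
    return xs, ys


# One marginal distribution per row of the prediction table (made-up values).
row_dists = {
    "t+1": NormalDist(mu=10.0, sigma=1.0),
    "t+2": NormalDist(mu=10.5, sigma=1.5),
    "t+3": NormalDist(mu=11.0, sigma=2.0),
}

# One x-y trace per row; these are what a ridgeline plot would draw.
traces = {label: density_trace(dist) for label, dist in row_dists.items()}
```

Each entry of `traces` is an x-y pair of the kind mentioned above as an alternative input to raw samples, so the plotting side would not need a KDE step at all.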
For a while now, I have been thinking about what a good plotting modality would be for fully distributional predictions, i.e., the output of `predict_proba` in `sktime` or `skpro`.

The challenge is that you have a (marginal) distribution for each entry in a `pandas`-like table, which seems hard to visualize. I've experimented with panels (`matplotlib.subplots`) but I wasn't quite happy with the result.

Now, by accident (just curious clicking), I've discovered `ridgeplot`.

What would you think of using the look & feel of `ridgeplot` as a plotting function in `BaseDistribution`? Where rows are the rows of the data-frame-like structure, and maybe there are also columns (but I am happy with the single-variable case too).

The main difference is that the distribution does not need to be estimated via KDE, you already have it in a form where you can access `pdf`, `cdf`, etc, completely, and you have the quantile function too which helps with selecting x-axis range.

Plotting `cdf` and other distribution defining functions would also be neat, of course; `pdf` (if it exists) or `cdf` (for survival) are already great.

Imagined usage, sth like
Dependencies-wise, one could imagine `ridgeplot` as a plotting soft dependency, like `matplotlib` or `seaborn`, of `skpro` and therefore indirectly of `sktime`.

What do you think?