Status: Open. elray1 opened 10 months ago.
A proposed goal would be to allow users to write code like this:

ggplot() +
  geom_stepahead_prediction(
    mapping = aes(x = target_date),
    data = model_output_tbl,
    interval_levels = c(0.5, 0.8)
  )

ggplot() +
  geom_stepahead_prediction(
    mapping = aes(x = target_date, group = model_id),
    data = model_output_tbl,
    interval_levels = c(0.5, 0.8)
  )

ggplot() +
  geom_stepahead_prediction(
    mapping = aes(x = target_date, group = c(model_id, reference_date)),
    data = model_output_tbl,
    interval_levels = c(0.5, 0.8)
  )

ggplot() +
  geom_stepahead_prediction(
    mapping = aes(x = target_date),
    data = model_output_tbl,
    interval_levels = c(0.5, 0.8)
  ) +
  facet_wrap(~ model_id)

We should also support things like color, fill, etc. It may be helpful to do this by doing some setup and then calling ggdist::geom_lineribbon, or else it might be best just to manually add a geom_line and some geom_ribbon layers.
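For discussion, here is a minimal sketch of the manual route: one geom_ribbon per interval level plus a geom_line for the median. It assumes quantile-format rows with a numeric output_type_id holding the quantile level. The value column name, the toy numbers, and the interval_bounds() helper are all assumptions made for illustration, not part of any existing package.

```r
library(ggplot2)

# Toy stand-in for a model_output_tbl with quantile forecasts (made-up values).
# The `value` column name is an assumption; it is not named in this thread.
dates <- as.Date("2024-01-06") + 7 * 0:3
q_levels <- c(0.1, 0.25, 0.5, 0.75, 0.9)
model_output_tbl <- data.frame(
  target_date = rep(dates, times = length(q_levels)),
  output_type = "quantile",
  output_type_id = rep(q_levels, each = length(dates))
)
model_output_tbl$value <- 100 + 5 * rep(0:3, times = length(q_levels)) +
  20 * qnorm(model_output_tbl$output_type_id)

# Hypothetical helper: pull the lower and upper quantiles of one central
# interval into the wide layout (one row per date) that geom_ribbon() expects.
interval_bounds <- function(tbl, level) {
  tail_prob <- (1 - level) / 2
  pick <- function(p, name) {
    rows <- tbl[abs(tbl$output_type_id - p) < 1e-8, c("target_date", "value")]
    names(rows)[2] <- name
    rows
  }
  merge(pick(tail_prob, "lower"), pick(1 - tail_prob, "upper"), by = "target_date")
}

medians <- model_output_tbl[abs(model_output_tbl$output_type_id - 0.5) < 1e-8, ]

# Widest interval first so that narrower ribbons are drawn on top of it.
p <- ggplot() +
  geom_ribbon(aes(x = target_date, ymin = lower, ymax = upper),
              data = interval_bounds(model_output_tbl, 0.8), alpha = 0.2) +
  geom_ribbon(aes(x = target_date, ymin = lower, ymax = upper),
              data = interval_bounds(model_output_tbl, 0.5), alpha = 0.4) +
  geom_line(aes(x = target_date, y = value), data = medians)
```

Because each layer is an ordinary ggplot2 geom, color, fill, and faceting would work the usual way, at the cost of the user doing the reshaping that a dedicated geom would hide.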
It may also be that rather than geom_stepahead_prediction, we instead need to write stat_stepahead_prediction which does conversions as necessary (in a way that's specific to the output_type) of a model_output_tbl into the thing that's naturally plotted by geom_... (I'm not really sure how this works).
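To make the stat idea a bit more concrete, here is a rough sketch of how a ggproto Stat could do the conversion for the quantile output type only. Everything named here is hypothetical: StatQuantileInterval, stat_quantile_interval(), the quantile_level aesthetic, the value column, and the toy data. It is meant as a starting point for discussion, not a proposed design.

```r
library(ggplot2)

# Hypothetical Stat: turns long-format quantile rows into ymin/ymax for one
# central interval. `quantile_level` is an invented aesthetic, not a ggplot2 one.
StatQuantileInterval <- ggproto(
  "StatQuantileInterval", Stat,
  required_aes = c("x", "y", "quantile_level"),
  compute_group = function(data, scales, level = 0.8) {
    tail_prob <- (1 - level) / 2
    pick <- function(p, name) {
      rows <- data[abs(data$quantile_level - p) < 1e-8, c("x", "y")]
      names(rows)[2] <- name
      rows
    }
    # One row per x with the interval bounds that geom_ribbon() needs.
    merge(pick(tail_prob, "ymin"), pick(1 - tail_prob, "ymax"), by = "x")
  }
)

# Hypothetical layer constructor, following the usual stat_*() pattern.
stat_quantile_interval <- function(mapping = NULL, data = NULL, geom = "ribbon",
                                   position = "identity", ..., level = 0.8,
                                   na.rm = FALSE, show.legend = NA,
                                   inherit.aes = TRUE) {
  layer(
    stat = StatQuantileInterval, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend, inherit.aes = inherit.aes,
    params = list(level = level, na.rm = na.rm, ...)
  )
}

# Toy quantile forecasts (made-up values; `value` is an assumed column name).
dates <- as.Date("2024-01-06") + 7 * 0:3
q_levels <- c(0.1, 0.25, 0.5, 0.75, 0.9)
model_output_tbl <- data.frame(
  target_date = rep(dates, times = length(q_levels)),
  output_type_id = rep(q_levels, each = length(dates))
)
model_output_tbl$value <- 100 + 20 * qnorm(model_output_tbl$output_type_id)

p <- ggplot(model_output_tbl,
            aes(x = target_date, y = value, quantile_level = output_type_id)) +
  stat_quantile_interval(level = 0.8, alpha = 0.2) +
  stat_quantile_interval(level = 0.5, alpha = 0.4)
```

A real implementation would presumably need to dispatch on output_type and handle several interval levels in one layer; this sketch sidesteps both by adding one layer per level.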
I like the idea and would welcome the opportunity to dig deeper into low-level ggplot programming.
A few scattered thoughts (from someone without much low-level ggplot experience)...
geom_lineribbon from ggdist seemed at first like an appealing solution to build a wrapper around, but I came away from my investigation with 2 main concerns:
1) A design premise for geom_lineribbon seems to be that a point element and an interval element are a sufficient set of basic visual ingredients for plotting a series of distributions (predictive or otherwise) along a time axis. I kind of like this idea but also felt like it was something that should be discussed before any one person devoted much time to building the wrapper.
2) Assuming that the "point + interval" ingredient list is the right approach, the issue then arises of how much to use ggdist's facilities for obtaining such elements. The basic tool for geom_lineribbon seems to be the point_interval family of functions in ggdist. If you dig into these, you start to get the sense that they operate in pretty ad hoc ways by wrapping base R functions and, more importantly, are really oriented toward samples from distributions rather than functionals of distributions (mean, median, support, etc.), which for me is the motivating principle of the output_type_id column. I think the point_interval functions also serve as S3 methods for objects from the packages distributional and posterior, but writing code around that felt like even more of a buy-in to a package ecosystem that I don't have any experience with.
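To illustrate the samples-versus-functionals distinction in concern 2, here is a small base R sketch (it does not call ggdist). The first summary is computed from draws, which is the style of input a sample-oriented summary function expects. The second just looks up quantiles that are already stored as rows keyed by output_type_id. The value column, the lookup_quantile() helper, and the numbers are assumptions for illustration.

```r
set.seed(1)

# Sample-oriented route: point and interval are computed from draws.
draws <- rnorm(1000, mean = 100, sd = 20)
sample_summary <- c(
  point = median(draws),
  lower = unname(quantile(draws, 0.1)),
  upper = unname(quantile(draws, 0.9))
)

# Functional-oriented route: the quantiles are already rows in the table,
# keyed by output_type_id, so nothing has to be estimated.
quantile_rows <- data.frame(
  output_type = "quantile",
  output_type_id = c(0.1, 0.5, 0.9),
  value = qnorm(c(0.1, 0.5, 0.9), mean = 100, sd = 20)
)

# Hypothetical helper: fetch the stored value for one quantile level.
lookup_quantile <- function(tbl, p) {
  tbl$value[abs(tbl$output_type_id - p) < 1e-8]
}

stored_summary <- c(
  point = lookup_quantile(quantile_rows, 0.5),
  lower = lookup_quantile(quantile_rows, 0.1),
  upper = lookup_quantile(quantile_rows, 0.9)
)
```

Both routes end in the same point, lower, upper shape, so a plotting layer could in principle accept either; the question raised above is which one the wrapper should treat as its native input.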
See also some brief discussion about the possibility of using forecast::geom_forecast here.
Re: Aaron's point 1 just above, it does seem to me that it would be nice if we could make plots of step-ahead forecasts that only had point predictions (no intervals, e.g. if plots with intervals are too crowded or probabilistic forecasts were not collected), or plots of intervals only (e.g. if someone wants to avoid calling attention to the point predictions).
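One possible way to get point-only or interval-only plots is to return the layers as a list and let flags control which ones are included, since ggplot2 accepts a list of layers on the right-hand side of +. The sketch below assumes data already reshaped to one row per target date. The stepahead_layers() function, its arguments, and the point, lower, and upper column names are hypothetical.

```r
library(ggplot2)

# Hypothetical helper: returns a list of layers so callers can drop the
# point predictions or the intervals independently.
stepahead_layers <- function(data, show_point = TRUE, show_interval = TRUE) {
  layers <- list()
  if (show_interval) {
    layers <- c(layers, list(
      geom_ribbon(aes(x = target_date, ymin = lower, ymax = upper),
                  data = data, alpha = 0.3)
    ))
  }
  if (show_point) {
    layers <- c(layers, list(
      geom_line(aes(x = target_date, y = point), data = data)
    ))
  }
  layers
}

# Toy wide-format forecasts (made-up values).
forecasts <- data.frame(
  target_date = as.Date("2024-01-06") + 7 * 0:3,
  point = c(100, 105, 110, 115),
  lower = c(80, 82, 84, 86),
  upper = c(120, 128, 136, 144)
)

p_both <- ggplot() + stepahead_layers(forecasts)
p_point_only <- ggplot() + stepahead_layers(forecasts, show_interval = FALSE)
p_interval_only <- ggplot() + stepahead_layers(forecasts, show_point = FALSE)
```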
:wave: Hi from the developer of {distributional} and forecast::geom_forecast().
We've also been working on various ways to visualise forecasts at @acefa-hubs. I'm happy to help with discussion, design, or development if it would be useful for you!
The idea is that this would make it easier to build up a plot layer by layer, allowing for easier customization by users who are familiar with ggplot, rather than calling a function that "does it all".
It might be possible to base these on functionality in the ggdist package here.