Open brookslogan opened 4 weeks ago
`arx_forecaster()`. Just make a new one.

Maybe this was described above, but I'm not quite clear. It sounds like the major issue is that calling `group_by()` on an `epi_df` has a strange effect on the `geo_value` (and whatever you are grouping on) that causes it to behave poorly with the `epi_workflow` processing. Is that right?
Context
@dsweber2 was just noting that
doesn't fit per-geo models; it actually just ignores the grouping altogether.
We suggested "transposing" the operations, but @rnayebi21 found that
doesn't work either; `.x` doesn't have the `geo_value` column and thus lacks `epi_df`-ness. I believe these problems also apply when you are trying to do version-faithful backtesting with `epix_slide()`.

Workarounds seem a little bit of a pain, either
- rebuild an `epi_df` inside the slide computation using `.x`, `.group_key`, and `.ref_time_value`, or
- `mutate(geo_value2 = geo_value)` and group by that instead of `geo_value`. [Or just `group_by(geo_value2 = geo_value)`.]

The first workaround seems more modular (you can have a list of forecasters that can all rely on ungrouped slides, rather than having to do a different type of slide call for each one).
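The missing-column symptom and the second workaround can be illustrated with plain dplyr, assuming the grouped slide hands each group to the computation the way `group_modify()` does. This is only a sketch: the toy tibble and the `cols_seen()` helper are made up for illustration, and no epiprocess or epipredict functions are called.

```r
library(dplyr)

# Toy data standing in for an epi_df (hypothetical values).
toy <- tibble(
  geo_value = rep(c("ca", "ny"), each = 3),
  time_value = rep(as.Date("2021-01-01") + 0:2, times = 2),
  cases = c(1, 2, 3, 10, 20, 30)
)

# Hypothetical helper: report which columns the per-group computation sees.
cols_seen <- function(.x, .group_key) {
  tibble(cols = paste(names(.x), collapse = ", "))
}

# Grouping by geo_value: the grouping column is stripped from .x.
dropped <- toy %>%
  group_by(geo_value) %>%
  group_modify(cols_seen)

# Workaround: group by a copy, so geo_value itself stays in .x.
kept <- toy %>%
  group_by(geo_value2 = geo_value) %>%
  group_modify(cols_seen)

print(dropped$cols)
print(kept$cols)
```

In the second call the computation still receives `geo_value`, which is what a forecaster expecting an `epi_df`-like input would need; the cost is the extra `geo_value2` column in the result.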
Proposal
- Have `arx_forecaster()` check specifically if there's a missing `geo_value` and hint that if they were doing a grouped `epix_slide()` or `epi_slide()` with `geo_value` in the group variables, that won't work, and to do <some workaround / feature> instead.
- Check whether the input to `arx_forecaster()` etc. is grouped; if so, either

Musings
We can also probably make things easier epiprocess-side, by adding a `.keep` parameter if we're not already able to forward to `group_modify()` via dots. But I'm not sure we actually want to... this makes it easier to use `epi_slide()` for forecasting when it shouldn't actually be (`epix_slide()` should be favored and maybe renamed to make this clear).

[@dshemetov points out we should document this geo-grouped `epi_slide` gotcha in epiprocess. And actually fixing what's going wrong is part of a much larger project, epiprocess#223.]
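For reference, the `.keep` idea from the Musings corresponds to an existing argument of dplyr's `group_modify()`. The sketch below shows only the dplyr behavior on a made-up tibble; whether `epi_slide()` or `epix_slide()` can forward such an argument is exactly the open question above and is not assumed here.

```r
library(dplyr)

# Toy data (hypothetical values).
toy <- tibble(
  geo_value = rep(c("ca", "ny"), each = 2),
  cases = c(1, 2, 10, 20)
)

# Per-group computation: does .x still carry the grouping column?
has_geo <- function(.x, .group_key) {
  tibble(has_geo = "geo_value" %in% names(.x))
}

# Default (.keep = FALSE): grouping columns are removed from .x.
without_keep <- toy %>%
  group_by(geo_value) %>%
  group_modify(has_geo)

# .keep = TRUE: grouping columns are retained in .x.
with_keep <- toy %>%
  group_by(geo_value) %>%
  group_modify(has_geo, .keep = TRUE)

print(without_keep$has_geo)
print(with_keep$has_geo)
```

Note that even with `.keep = TRUE`, the data frame returned by the computation must not itself contain the grouping columns, so a forecaster's output would still need `geo_value` dropped before it is returned.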