Open nickreich opened 2 months ago
I've addressed most of these issues. A few follow up notes:
model_id
column." Happy to add a bit more detail if it's needed. (1) I can manually force tables to be only on one page if we prefer this (I do agree it looks cleaner), but note that I have also not found a way to do this otherwise.
(4) I can take care of moving this to somewhere in section 5 since I agree it does belong there more than in section 4.
The paper looks great! I have a series of small comments, none of which I really see as essential or should hold up submission, but these are more minor details that I think would make it all just a bit tighter.
[x] Table 1: (a) consider making the team name simpler. as it is now, it is distracting (hard for an initiated reader to understand what "MOBS-GLEAM_FLUH" is) and it requires the table to wrap to 2 lines per row. How about just "teamX-modelY" or something like that? or at least explain what that team/model abbreviation is. (b) can it be printed to not wrap onto a second page? (c) consider sorting by horizon, so the values for each modeling task are in order.
[x] in general, is the model_id column ever introduced or discussed? if not, maybe adding one sentence about what its use it would be valuable.
[x] There is a phrase now in the intro that reads: "Complementing other software for combining predictions from multiple models (e.g., (Pedregosa et al. 2011; Weiss et al. 2019; Bosse et al. 2023; Couch and Kuhn 2023)), hubEnsembles supports..." which has a second set of parentheses that should be removed if possible.
[x] Table 2: same comment as above with model_id. either explain what the abbreviation is or make it easier to understand.
[x] "The output_type_id provides additional identifying information for a prediction and is specific to the particular output_type (see Table 1)." should be "see Table 3".
[x] Figure 1: somewhere it should be mentioned that the y-axis is log scale. maybe on the axis label itself and in the legend?
[x] I'd suggest referencing Figure 1 when first introducing the mathematical notation for the ensemble in section 2, as the visual depiction of the calculations may be helpful for many readers not familiar with these ideas. The figure seems best where it is since it builds on definitions in section 3, but maybe a mention like "For a visual depiction of these equations, see Figure 1 below."
[x] section 4 says "The figures from here on are generated re- producibly using data from rds files stored in the analysis/data/raw-data directory and scripts in the inst directory." but then also "We will use an example hub provided by the hubverse to demonstrate the functionality of the hubEnsembles package". It's not clear to me which is being used.
[x] Figure 5: it looks like two levels of prediction intervals are shown, but this is not described in the legend.
[x] in at least one place, "Flusight" is written, instead of "FluSight" with the second capital letter. it should be the "FluSight".
[x] arguments inside the
simple_ensemble()
call are not appropriately indented in section 5.1. The lines with weights, agg_fun and model_id should start a few spaces in. This is what they look like now in the manuscripthubEnsembles::simple_ensemble(
weights = NULL,
agg_fun = "mean",
model_id = "mean-ensemble" +)
[x] Add citation to hubverse documentation to first sentence of second paragraph of the discussion