Open andrewbaxter439 opened 2 years ago
Changed to 95%CIs in dc063e5
Other key question @dkopasker is how best to show the CIs: this uses ribbons, but I can also do a line-range for each point instead.
Conducting multiple runs is equivalent to bootstrapping. The 95% confidence interval is defined by the observations ranked at the 2.5th and 97.5th percentiles, which is why ranking was performed. This may give a narrow range, but I don't think it is what you have done above.
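A minimal sketch of the percentile method described above, in base R. The `run_means` vector here is an illustrative stand-in for one summary value per simulation run; the names and numbers are not from the project.

```r
# Stand-in for one summary value per run (illustrative only)
set.seed(42)
run_means <- rnorm(1000, mean = 11, sd = 0.5)

# Rank the run-level values and take those at the 2.5th and 97.5th
# percentiles to form the 95% interval
ci_95 <- quantile(run_means, probs = c(0.025, 0.975))
```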
A ribbon that includes the areas between points would be my preference for displaying confidence intervals. Others would object to this since we do not have observations between annual points. It's a matter of preference that is likely to be decided by a peer reviewer. Code to do both would be useful.
Please label the y-axis and state what the red line means. A note indicating how many runs were conducted would also be useful. I also prefer a white background with one black and one grey line for publications.
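For reference, a minimal ggplot2 sketch of that styling: white background, one black and one grey line, a labelled y-axis, and a caption noting the run count. The data frame and column names are illustrative, not from the project's output files.

```r
library(ggplot2)

# Illustrative data: annual means under baseline and reform scenarios
plot_data <- data.frame(
  year     = rep(2010:2015, 2),
  ghq      = c(11.2, 11.0, 11.1, 10.9, 10.8, 10.7,
               11.2, 10.8, 10.6, 10.4, 10.3, 10.1),
  scenario = rep(c("Baseline", "Reform"), each = 6)
)

p <- ggplot(plot_data, aes(year, ghq, colour = scenario)) +
  geom_line() +
  scale_colour_manual(values = c(Baseline = "black", Reform = "grey50")) +
  labs(y = "Mean GHQ score", colour = NULL,
       caption = "Intervals based on 1,000 simulation runs") +
  theme_classic()  # white background for publication
```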
Thanks @dkopasker - minor facepalm there for me then, I do see now what you mean! Will sort these points shortly
@andrewbaxter439 Any progress so far?
Apologies - I've not had a chance to look at this again since the last changes. The next plan would be to turn this into a function runnable for each variable. Is there a key deadline for this?
> not had a chance to look at this again since last changes
No worries.
> Is there a key deadline for this?
I'm not aware of any; however, we need to ask @dkopasker when it comes to deadlines.
Not a deadline as such. I expect we will re-run the 1,000-run simulations near the end of this month with the intention of writing up. It would be helpful if publication-quality graphs with confidence intervals can be readily produced from the large output files.
I'd say mid-September is our target date then. I expect to be knee-deep in the code, teaching, and on leave till the end of this year, so it'd be great not to delay the development of essential parts of the code. Any minor corrections/improvements could be introduced later as we go.
Created a function in R/graphing_functions.R which can now graph a baseline/comparison line graph with assignable y-axis labels.
```r
library(readr)
library(tidyverse)
source("R/graphing_functions.R")

out_data <-
  read_csv("C:/Programming/covid19_effect_estimates/data/new_data.csv",
           show_col_types = FALSE)

out_data |>
  graph_policy_comparisons(out_ghq_baseline, out_ghq_reform, y_lab = "GHQ score")
#> Loading required package: SPHSUgraphs
```
Thanks Andy. This looks useful. For comparison, could you please produce a graph without ribbons but with the uncertainty interval marked at each point only? As mentioned above, some may object to areas between points being covered by a confidence interval when such points were not estimated.
Eventually, we will need to produce these graphs using the median. It may be worth retaining code for both the mean and median to allow for the preferences of peer reviewers.
> It may be worth retaining code for both the mean and median
That's a given, I'll try to fix this.
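One way to keep both summaries available is to compute them side by side when aggregating across runs. A dplyr sketch, with illustrative data and column names (not the project's actual output structure):

```r
library(dplyr)

# Illustrative run-level data: one GHQ value per run per year
set.seed(1)
run_data <- expand.grid(run = 1:50, year = 2010:2012)
run_data$ghq <- rnorm(nrow(run_data), mean = 11, sd = 0.4)

# Summarise each year across runs, retaining mean, median,
# and quantile-based interval bounds
run_summary <- run_data |>
  group_by(year) |>
  summarise(
    mean_ghq   = mean(ghq),
    median_ghq = median(ghq),
    ci_low     = quantile(ghq, 0.025),
    ci_high    = quantile(ghq, 0.975),
    .groups    = "drop"
  )
```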
Added two options: using the mean instead of the median for points (CIs should still be quantiles, I imagine?) and using other geoms, such as errorbar, for the ranges:
```r
out_data |>
  graph_policy_comparisons(
    out_ghq_baseline,
    out_ghq_reform,
    y_lab = "GHQ score",
    agg_method = mean,
    ci_geom = "errorbar"
  )
```
As a quick initial run of creating graphs, here's the output of the code currently in R/outputting graphs.R. As a small initial point: these outputs currently have a very narrow range (I accidentally used just the standard error of the means across 50 runs in the file). Should intervals combine the SDs of the means of each run, rather than taking the variance between the mean outputs?
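To illustrate why the range came out so narrow: the standard error of the mean shrinks with the number of runs, whereas a percentile interval across run-level means reflects their full spread. A base-R sketch of the two (numbers illustrative; this doesn't settle which is appropriate here):

```r
# Stand-in for one mean per run, 50 runs (illustrative only)
set.seed(7)
run_means <- rnorm(50, mean = 11, sd = 0.5)

# Standard-error interval: narrows as the number of runs grows
se <- sd(run_means) / sqrt(length(run_means))
se_interval <- mean(run_means) + c(-1.96, 1.96) * se

# Percentile interval: reflects the spread of run-level means directly
pct_interval <- quantile(run_means, probs = c(0.025, 0.975))
```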
Created on 2022-07-15 by the reprex package (v2.0.1)