Week of 5/1/2023 & 5/8/2023

juliabruneau commented 1 year ago

Tasks from our weekly meeting:

[x] Does grid.arrange() automatically generate a third row?
[x] Test markdown with 3 models
[x] Setting the figure height/width to different sizes in different chunks
[x] Add side-by-side method to Rmd where needed (use glist = list() and glist[[r]] = png::readPNG() to read in the images)
[x] Adding an additional scatter plot to the existing scatter plot if there are more than 2 models
[ ] Editing the scatter plots to be at a 1:1 ratio
[x] Adding a gray 45 degree line in the background of the scatter plots
[x] Adding an additional box plot to our current box plot figure if there are more than 2 models
- Model 1 - Model 2
- Model 1 - Model 3
[ ] Formatting the FTABLEs output
[ ] Editing the loops to be more efficient?

Discoveries:

fread makes the runtime significantly shorter than read.csv.
Using multiple opts_chunk$set() calls in between chunks sets figures at different heights and widths, so we can now fit 2 figures per page if wanted.
grid.arrange() does NOT automatically generate a third row; it autofits 3 figures side-by-side.
Adding nrow = length(models)-1 will add a row if there are 3 models.
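
A minimal sketch of the nrow = length(models)-1 idea, using blank placeholder panels instead of the real figures. The models vector and the placeholder grobs are made up for illustration; arrangeGrob() is used here because it builds the same layout as grid.arrange() without drawing it.

```r
library(gridExtra)  # arrangeGrob() / grid.arrange()
library(grid)       # rectGrob() placeholders

# Hypothetical model list (names are made up)
models <- c("model1", "model2", "model3")

# Placeholder panels standing in for the real figures (assumption: one panel per model)
panels <- lapply(seq_along(models), function(i) rectGrob())

# nrow = length(models) - 1 requests 2 rows for 3 models and 1 row for 2 models
layout <- arrangeGrob(grobs = panels, nrow = length(models) - 1)

# grid.arrange() takes the same arguments and draws the layout immediately:
# grid.arrange(grobs = panels, nrow = length(models) - 1)
```

With only 2 models the same expression gives nrow = 1, so the existing 2-model layout should not need a separate branch.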

juliabruneau commented 1 year ago

Command to run 3 models at the same time:

rmarkdown::render(
  'C:/Users/VT_SA/Documents/GitHub/HARParchive/HARP-2022-Summer/AutomatedScripts/ws_model_summary.Rmd',
  output_file = 'C:/Users/VT_SA/Documents/HARP/MarkdownSummaryTest',
  params = list(
    doc_title = "Test HSP2 Model Summary",
    rseg.file.path = c(
      "/media/model/p6/out/river/hsp2_2022/hydr/JA4_7280_7340_hydrd_wy.csv",
      "/media/model/p6/out/river/subsheds/hydr/JA4_7280_7340_hydrd_wy.csv",
      "/media/model/p6/out/river/subsheds/hydr/JA4_7280_7340_hydrd_wy.csv"
    ),
    rseg.hydrocode = c("JA4_7280_7340", "vahydrosw_wshed_JA4_7280_7340", "vahydrosw_wshed_JA4_7280_7340"),
    rseg.ftype = c("cbp60", "vahydro", "vahydro"),
    rseg.model.version = c("cbp-6.0", "cbp-6.1", "cbp-6.1"),
    runid.list = c("hsp2_2022", "subsheds", "subsheds"),
    rseg.metric.list = c("7q10", "l90_Qout", "l30_year")
  )
)

Scatterplot with 3 models in it:

This just repeats one of the models twice for testing purposes. If we could find 3 distinct models within one river segment, it would give us a more realistic result. @rburghol @jdkleiner

Still working on how to get 2 boxplots side-by-side at the same x-axis position.

juliabruneau commented 1 year ago

Box Plot with 3 Models

Ignore the data: it was made up just to have 3 models.

I was able to plot the flow metrics at the same x-axis location:

This was done by first stacking the %diff data from both model comparisons into one column, and then adding a 3rd column that gives a shared label (Mean, l90, ...) to the matching metric columns from each comparison:

      values       ind   V3
1 -0.0123069 Mean Flow Mean
2 -0.0123069 Mean Flow Mean
3 -0.0123069 Mean Flow Mean
4 -0.0123069 Mean Flow Mean
5 -0.0123069 Mean Flow Mean
6 -0.0123069 Mean Flow Mean
...
       values  ind V3
215 -23.94453 l1.1 l1
216 -46.53405 l1.1 l1
217 -83.37953 l1.1 l1
218 -29.90746 l1.1 l1
219 -55.41005 l1.1 l1
220 -90.19874 l1.1 l1

That way I was able to fill the colors in the box plots by the flow metric. Overall code:

library(ggplot2)

if (length(rseg.model.version) == 3) {

  # Interleave the columns of the two %diff tables (1v2 and 1v3) so each metric's pair sits together
  binded <- cbind(perc.diff.data, perc.diff.data.2)[order(c(seq_along(perc.diff.data), seq_along(perc.diff.data.2)))]

  # Stack into long format: `values` holds the %diff, `ind` the source column name
  box.data <- stack(binded)

  # Shared metric label for each pair of columns
  # (assumes the 5 metric columns are in the order Mean, l90, l30, l7, l1)
  box.data$V3 <- rep(c("Mean", "l90", "l30", "l7", "l1"), each = nrow(binded) * 2)

  # Data and mapping go in ggplot() so every layer uses them; the outlier
  # styling is set on the same geom_boxplot() layer that draws the boxes
  perc.diff.fig <- ggplot(box.data, aes(x = ind, y = values, fill = V3)) +
    stat_boxplot(geom = "errorbar") +
    geom_boxplot(position = "dodge", outlier.colour = "red", outlier.shape = 8,
                 outlier.size = 3) +
    scale_x_discrete(name = "Models Compared",
                     labels = rep(c("1v2", "1v3"), 5)) +
    scale_y_continuous(name = "% Difference") +
    scale_fill_brewer() +
    labs(fill = "Flow Metric") +
    theme_light()

  perc.diff.fig
}
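
The stacking step can be tried on its own with a small made-up data set. This is only a sketch: the data frames diff_1v2 and diff_1v3 and their values are hypothetical, and the shared label is derived with sub() rather than the rep() approach used in the script, which avoids relying on the column order.

```r
# Hypothetical %diff values for two comparisons (1v2 and 1v3), two metrics each
diff_1v2 <- data.frame(Mean = c(-1.2, 0.4), l90 = c(-5.0, -3.1))
diff_1v3 <- data.frame(Mean = c(-0.8, 0.9), l90 = c(-7.5, -2.2))

# Give the second comparison distinct column names so each box gets its own x position
names(diff_1v3) <- paste0(names(diff_1v3), ".1")

# Interleave the columns: Mean, Mean.1, l90, l90.1
combined <- cbind(diff_1v2, diff_1v3)[order(c(seq_along(diff_1v2), seq_along(diff_1v3)))]

# stack() puts every value in one column and records the source column in `ind`
box_data <- stack(combined)

# Shared metric label used for the fill colour: strip the ".1" suffix
box_data$metric <- sub("\\.1$", "", as.character(box_data$ind))

print(box_data)
```

Mapping x to ind and fill to metric then colours the 1v2 and 1v3 boxes of the same flow metric identically.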

juliabruneau commented 1 year ago

Work Left on the Markdown - Final Update from Julia

I was able to finish generating the scatterplots and boxplots with 3 models. Those details are commented above.

The one coding task I didn't finish is displaying the scatter plots at a 1:1 ratio.

I tried aspect.ratio within theme(), but I wasn't able to get the figures to fill the full output grid (they were really small)
- another potential solution could be making the x- and y-axis lengths identical
Changing the display size of the figures alters the labels on each figure, so those will need to be edited after the figure ratios are changed
I added the 45 degree gray line in the background already as discussed
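
One option that has not been tried in the script yet is coord_fixed() with identical limits on both axes, instead of aspect.ratio in theme(). The sketch below uses made-up flows and hypothetical column names (model1, model2); whether the panels then fill the output grid will still depend on the chunk's fig.width and fig.height, so a square figure size is probably needed as well.

```r
library(ggplot2)

# Hypothetical flows for two models (made-up numbers)
flows <- data.frame(model1 = c(10, 25, 40, 80, 120),
                    model2 = c(12, 22, 45, 70, 130))

# Same limits on both axes so the 45 degree line runs corner to corner
lims <- range(c(flows$model1, flows$model2))

scatter <- ggplot(flows, aes(x = model1, y = model2)) +
  geom_abline(slope = 1, intercept = 0, colour = "gray") +  # added first, so it sits behind the points
  geom_point() +
  coord_fixed(ratio = 1, xlim = lims, ylim = lims) +        # one x unit is drawn as long as one y unit
  theme_light()

scatter
```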

Other final tasks include overall formatting of any annotations/figure captions, and deciding on the best output figure heights and widths for all the different figures. Now that we know they can be set between chunks, it should be pretty straightforward.
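
As a rough illustration of setting sizes between chunks, the calls below would each go in their own small chunk ahead of the figures they should affect. The width and height values are placeholders, not the sizes chosen for the summary; opts_chunk$set() changes the defaults for the chunks that come after it.

```r
library(knitr)

# In a small chunk placed before the wide, side-by-side figures (placeholder sizes):
opts_chunk$set(fig.width = 8, fig.height = 3)

# ... chunks that draw the side-by-side figures go here ...

# In another small chunk placed before the square scatter plots (placeholder sizes):
opts_chunk$set(fig.width = 5, fig.height = 5)

# Later chunks pick up the most recent values
opts_chunk$get(c("fig.width", "fig.height"))
```

Individual chunks can still override these defaults in their own chunk header when a single figure needs a different size.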

The last step is testing with 3 different models that are actually from the same river segment.
