kylebaron commented 1 year ago

Summary

This PR adds a feature set for creating multipanel displays of diagnostic plots:

ETA versus covariates
NPDE versus covariates
CWRES versus covariates
NPDE goodness of fit plots
CWRES goodness of fit plots

In general

CWRES should behave just like NPDE.
When we plot versus covariates, they can be either categorical or continuous; the code picks that up.
There are _list versions of most paneled plots (where it makes sense) that get called under the hood; these _list functions return lists of plots or lists of lists of plots: basically everything up to the arrangement of plots on the grid.
Plots in the output lists are all named.
Once you have this _list output, you can arrange the plots yourself via a with() method.
It's pretty hard to test this stuff; I did my best to at least exercise the code. See the PDF attached here; I also included the source code under tests/testthat/rmd so it can be run again.
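
A minimal sketch of how that categorical/continuous detection could work, for readers following along. This is not the pmplots implementation: covariate_type() is a hypothetical helper, and the rule it applies (character, factor, and logical columns are categorical; numeric columns are continuous) is an assumption.

```r
# Hypothetical helper: decide how a covariate column should be plotted.
# Assumption: character, factor, and logical columns are categorical;
# numeric columns are continuous.
covariate_type <- function(x) {
  if (is.factor(x) || is.character(x) || is.logical(x)) {
    return("categorical")
  }
  if (is.numeric(x)) {
    return("continuous")
  }
  stop("unsupported column type: ", class(x)[1])
}

# Toy data; the column names are borrowed from the examples in this thread.
df <- data.frame(
  AAG    = c(80.1, 95.3, 70.2),
  STUDYc = c("SAD", "MAD", "SAD"),
  stringsAsFactors = FALSE
)

# One label per column, named by column.
types <- vapply(df, covariate_type, character(1))
types
```

A plotting wrapper could then branch on the label, for example drawing a boxplot for categorical columns and a scatter plot for continuous ones.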

ETA

eta_covariate() and eta_covariate_list()
Default output is a list of lists: the outer list is the ETAs and the inner lists are the covariates
These plots can be transposed so that the outer set is the covariates and the inner set is the ETAs
eta_covariate() can arrange by columns or rows
tag_levels gets passed through to patchwork inside eta_covariate()
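
To make the transpose concrete, here is a base R sketch that uses strings as stand-ins for plots. transpose_nested() is a hypothetical helper written for illustration; it is not how eta_covariate() does it internally, and it assumes every inner list carries the same names.

```r
# Stand-in for a list of lists of plots: strings replace the plot objects.
# Outer names are ETAs, inner names are covariates (the default layout).
by_eta <- list(
  ETA1 = list(RF = "ETA1 vs RF", AAG = "ETA1 vs AAG"),
  ETA2 = list(RF = "ETA2 vs RF", AAG = "ETA2 vs AAG")
)

# Hypothetical helper: swap the outer and inner levels of a named nested list.
# Assumes every inner list has the same names as the first one.
transpose_nested <- function(x) {
  inner <- names(x[[1]])
  out <- lapply(inner, function(nm) lapply(x, function(el) el[[nm]]))
  names(out) <- inner
  out
}

by_cov <- transpose_nested(by_eta)
names(by_cov)     # outer level is now the covariates
names(by_cov$RF)  # inner level is now the ETAs
by_cov$AAG$ETA2
```

Because both levels stay named, the transposed result can still be indexed by name in either orientation.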

NPDE - covariate

npde_covariate() and npde_covariate_list()
Similar setup to eta_covariate() but there is only one level rather than two
- npde_covariate() returns a single plot, arranged
- npde_covariate_list() returns a list of plots

CWRES - covariate

This should be pretty much identical to the setup we have for npde_covariate

NPDE - diagnostics

npde_panel() gives you a single panel with all the NPDE diagnostics
npde_panel_list() gives you a list of the component plots
npde_hist_q() gives you just the NPDE histogram and q-q plot; no list option
npde_scatter() gives you the scatter plots - NPDE versus TIME, TAD, PRED; no list option
None of these plots can be transposed

CWRES - diagnostics

This should be pretty much identical to the setup we have for npde diagnostics

displays.pdf

KatherineKayMRG commented 1 year ago

@kylebaron - I love this idea.

For the eta-cov plots, would it be possible to pass in the configuration of the plots too? So if I want something in this kind of configuration (below), I would pass

eta_covariate(id, x, etas = y, tag = TRUE, arrange = (p1 + p5) / (p3 + p4) / p2)
# or
eta_covariate(id, x, etas = y, tag = TRUE, arrange = (PTYPE + PGPI) / (SEX + HEPAT) / RACEC)

So basically can I pass it a customisable patchwork argument - obviously with your new functionality that could include continuous and cat covs

KatherineKayMRG commented 1 year ago

Also, for the npde plot (and maybe the cwres too?) would that have the option to be flexible? So for example, could we give the user the option of which plot to put in the middle? PRED, TIME or TAD?

And what about the option to drop TIME or TAD? Then, if for example users drop TAD, it could naturally make a 2x2 grid with the option to swap PRED and TIME in one column


kylebaron commented 9 months ago

@KatherineKayMRG - what do you think about this? Return a list with sensible names that you can arrange?

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(patchwork)
library(pmplots)
#> Loading required package: ggplot2

data <- pmplots_data_obs()
id <- dplyr::distinct(data, ID, .keep_all = TRUE)
x <- c("RF", "AAG", "SCR//Creatinine (mg/dL)", "CPc", "STUDYc", "AST")
y <- c("ETA1//ETA-CL", "ETA2//ETA-V2", "ETA3//ETA-KA")

l <- npde_panel_list(data, xname = "demothizone (ng/mL)")

with(l, (time + tad) / (q + hist) / pred)

with(l, q + hist + tad)

Created on 2024-01-24 with reprex v2.0.2
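
For anyone unfamiliar with the idiom: with() evaluates an expression using the elements of a named list as variables, which is what lets the layout be written in terms of plot names. The sketch below uses numbers instead of plots so it runs in base R; with patchwork loaded and real plots, + places plots side by side and / stacks them, rather than doing arithmetic.

```r
# Numbers stand in for plots so the sketch runs without ggplot2 or patchwork.
# The names mirror those used in the layout expressions above.
l <- list(time = 1, tad = 2, q = 3, hist = 4, pred = 5)

# with() looks up time, tad, q, hist, and pred inside the list first.
res_with <- with(l, (time + tad) / (q + hist) / pred)

# The same expression written out with explicit list access.
res_plain <- (l$time + l$tad) / (l$q + l$hist) / l$pred

identical(res_with, res_plain)
```

The with() form keeps the layout expression short, which matters more as the number of plots in the list grows.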

KatherineKayMRG commented 9 months ago

I love this!

Can you do something similar with other functions? Like the ETA functions (screenshot from MERGE)

[Image: Screenshot 2024-01-25 at 10 43 02 AM]

kylebaron commented 9 months ago

@KatherineKayMRG - yeah, it'll be same idea for ETAs

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(patchwork)
library(pmplots)
#> Loading required package: ggplot2

data <- pmplots_data_obs()
id <- dplyr::distinct(data, ID, .keep_all = TRUE)
x <- c("RF", "AAG", "SCR//Creatinine (mg/dL)", "CPc", "STUDYc", "AST")
y <- c("ETA1//ETA-CL", "ETA2//ETA-V2", "ETA3//ETA-KA")

Standard panel

eta_covariate(id, x, y)[[1]]
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'

Roll your own

l <- eta_covariate_list(id, x, y)
with(l$ETA1, (RF+AAG)/STUDYc, tag_levels = "A")
#> `geom_smooth()` using formula = 'y ~ x'

Created on 2024-01-25 with reprex v2.0.2

kylebaron commented 9 months ago

Standard panel - by row

x <- c("RF", "CPc", "STUDYc", "AAG", "SCR//Creatinine (mg/dL)",  "AST")
eta_covariate(id, x, y)[[1]]
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'

Standard panel - by column

x <- c("RF", "CPc", "STUDYc", "AAG", "SCR//Creatinine (mg/dL)",  "AST")
eta_covariate(id, x, y, byrow = FALSE)[[1]]
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'

Created on 2024-01-25 with reprex v2.0.2

KatherineKayMRG commented 9 months ago

Very nice. Is there an option for the opposite? For example, if I want to look if there is a weight effect or a RF effect across all covs?

Didn't know eta_covariate existed! What a great option when there are a limited number of covs!

kylebaron commented 9 months ago

> Is there an option for the opposite? For example, if I want to look if there is a weight effect or a RF effect across all covs?

Not at the moment; but agree we should be able to do that. I basically want to cover all of the code we currently have in place to do this sort of thing and make doing the "other" thing really easy.

kylebaron commented 9 months ago

> What a great option when there are a limited number of covs!

I like this too; it'd be great to start doing more of these panels and cut some figures / pages from the report. And limited covs is a great situation to just dump everything out on a single page and move on.

kylebaron commented 9 months ago

@KatherineKayMRG

Transposable display

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(patchwork)
library(pmplots)
#> Loading required package: ggplot2

data <- pmplots_data_obs()
id <- dplyr::distinct(data, ID, .keep_all = TRUE)
x <- c("RF", "AAG", "SCR//Creatinine (mg/dL)", "CPc", "STUDYc", "AST")
y <- c("ETA1//ETA-CL", "ETA2//ETA-V2", "ETA3//ETA-KA")

eta_covariate(id, x, y)[[1]]
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'

eta_covariate(id, x, y, transpose = TRUE, ncol = 1)[c(1,2)]
#> $RF

#> 
#> $AAG
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'
#> `geom_smooth()` using formula = 'y ~ x'

Created on 2024-01-25 with reprex v2.0.2

kylebaron commented 8 months ago

Thanks for reviewing, @kyleam ; I believe everything is addressed.

On your suggestion, I did give vdiffr a try. It seems to work, I think? But I quickly ran into this issue. svg images are generally 800K to 1 MB, and I had 20 MB of these files after first pass. Although we're not subject to CRAN limitations, I pulled back on this b/c I'm not sure it's sustainable. What do you think?

The svg size issue was cross posted here: https://github.com/r-lib/testthat/issues/1732

Although not a test, I did spin that demo doc into a vignette so at least functionality can be verified that way.

kyleam commented 8 months ago

> On your suggestion, I did give vdiffr a try. [...] svg images are generally 800K to 1 MB, and I had 20 MB of these files after first pass

Yeah, I think it makes sense to hold off given those sizes.

ggplot2 and patchwork are coming in much lighter (edit: apologies for my confusing mix of iec-i and iec output):

# ggplot2

$ git ls-tree -lr v3.4.4 tests/testthat | sort -k4,4 -gr | \
  head | numfmt --field=4 --to=iec-i
100644 blob c6a2038aba44c5a65cf7ec34f8a0cce64643a9ba   104Ki tests/testthat/_snaps/coord-map/usa-mercator.svg
100644 blob 3d5fa0b240156383c70438a60efb7b4b120079bb   104Ki tests/testthat/_snaps/coord-map/coord-map-switched-scale-position.svg
100644 blob 17142781de60cb45c78ed218f2e61d2415611f89    90Ki tests/testthat/_snaps/geom-violin/grouping-on-x-and-fill-dodge-width-0-5.svg
100644 blob 56049d8ef60474a49f9083b1eebf524959593105    89Ki tests/testthat/_snaps/geom-violin/grouping-on-x-and-fill.svg
100644 blob 1494c6bd08fc55c884b86872c39308748d9ae75d    53Ki tests/testthat/_snaps/geom-violin/with-smaller-bandwidth-and-points.svg
100644 blob 1db22dd4418ee6feedae58e2223911f74add1862    52Ki tests/testthat/_snaps/geom-violin/with-tails-and-points.svg
100644 blob c86d890439ecec126825a6c8860e69c2888439a4    50Ki tests/testthat/_snaps/guides/align-facet-labels-facets-vertical.svg
100644 blob 84b81e2ba21e501c437e3ded4d9673a38ca2056c    49Ki tests/testthat/_snaps/guides/align-facet-labels-facets-horizontal.svg
100644 blob 8bec1ac1a6865986eb3840c63df315445a68e4ec    48Ki tests/testthat/_snaps/geom-violin/quantiles.svg
100644 blob 86a328e5b52836d3916313bb7b71c6a83fa4f934    48Ki tests/testthat/_snaps/geom-violin/dodging-and-coord-flip.svg

$ git ls-tree -lr v3.4.4 tests/testthat | grep snaps | grep svg | \
  awk '{ SUM += $4 } END { print SUM }' | numfmt --to=iec
3.3M

# patchwork

$ git ls-tree -lr v1.2.0 tests/testthat | sort -k4,4 -gr | \
  head | numfmt --field=4 --to=iec-i
100644 blob ccdad1aa3340c15c2532aac6ef3804a7ea43bd68    47Ki tests/testthat/_snaps/collect_axes/multi-cell-title-and-axis-collection.svg
100644 blob f92b91880180238cb7cfc16cf15dc1a83c0823f6    46Ki tests/testthat/_snaps/arithmetic/adding-to-all-subplots-patchwork-theme-bw.svg
100644 blob 75b2158f7976ad136ea204307129a87dd1a8b388    40Ki tests/testthat/_snaps/arithmetic/adding-to-all-on-level-patchwork-theme-bw.svg
100644 blob 897d826561ecb861224aabdd238cfcb8c5e37f17    34Ki tests/testthat/_snaps/arithmetic/complex-composition-p1-p2-p3-p4.svg
100644 blob 6a2aaefffe61f1e4c046e5b6228f4a15526ca50f    33Ki tests/testthat/_snaps/layout/setting-widths-as-units.svg
100644 blob 84f4baaa8f93f55820e882841b22c20f51a5f453    33Ki tests/testthat/_snaps/layout/setting-heights-as-units.svg
100644 blob 92e7fd0a626c368806bf19dbd677f24d0039ea83    33Ki tests/testthat/_snaps/arithmetic/pack-4-plots-p1-p2-p3-p4.svg
100644 blob a5bd2aa8071e7ad302cd5640cac1bd816d5cafec    33Ki tests/testthat/_snaps/layout/setting-heights.svg
100644 blob 9bc6de34ea29b4dcb2fd64aff9c2d983902aa4ba    33Ki tests/testthat/_snaps/layout/setting-widths.svg
100644 blob 51cabdcbcfb833b26646959f97b356e01fa4804d    33Ki tests/testthat/_snaps/layout/setting-nrow.svg

$ git ls-tree -lr v1.2.0  tests/testthat | grep snaps | grep svg | \
  awk '{ SUM += $4 } END { print SUM }' | numfmt --to=iec
847K

I guess it's probably just a matter of how complicated the plots are (in terms of number of elements).
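
The same tally can be approximated from R. A minimal sketch, assuming snapshots are .svg files somewhere under a directory such as tests/testthat/_snaps; svg_bytes() is a hypothetical helper, and the demo directory is created only so the example is self-contained. Note that it measures files on disk rather than blobs at a git tag, so it is not an exact equivalent of the commands above.

```r
# Hypothetical helper: total size, in bytes, of all .svg files under a directory.
svg_bytes <- function(dir) {
  files <- list.files(dir, pattern = "\\.svg$", recursive = TRUE, full.names = TRUE)
  sum(file.size(files))
}

# Demo on a throwaway directory so the sketch is self-contained.
snap_dir <- file.path(tempdir(), "snaps-demo")
dir.create(snap_dir, showWarnings = FALSE)
writeLines("<svg></svg>", file.path(snap_dir, "a.svg"))
writeLines("not a snapshot", file.path(snap_dir, "notes.txt"))

# Only a.svg is counted; notes.txt is ignored by the pattern.
total <- svg_bytes(snap_dir)
total
```

Pointing svg_bytes() at a package's snapshot directory before committing gives a quick read on whether the snapshot set is growing past what the repository can comfortably carry.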

kylebaron commented 8 months ago

> I hit one snag:

Thanks; I updated that to require_patchwork(). I was hoping that I had tested that, but I'm guessing the patchwork namespace was already loaded by the time I got to that test. I'll make a pass to make sure we're requiring patchwork everywhere in this series; maybe we should just import it at some point going forward.

kylebaron commented 8 months ago

Actually ... I think it's the with.pm_display() method that really needs to require patchwork; once we get dispatched there, we are expecting to do some patchwork math that should always require that namespace to be loaded.

kyleam commented 8 months ago

> Actually ... I think it's the with.pm_display() method that really needs to require patchwork

Hmm, where were you originally thinking of adding the require_patchwork call? with.pm_display was where I was suggesting that you call it (and what I tested locally).

kylebaron commented 8 months ago

@kyleam - I read too quickly and was thinking it was the upstream function that was lacking. Agree with your suggestion that modifying with() fixes this in the most comprehensive way. Thanks.
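
A rough sketch of the pattern being discussed, assuming nothing about the actual pmplots source: require_pkg() is a hypothetical stand-in for require_patchwork(), and demo_display is a made-up class standing in for pm_display. The check lives inside the with() method, so any layout expression fails early with a clear message when patchwork is missing.

```r
# Hypothetical stand-in for the require_patchwork() helper mentioned above.
require_pkg <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("package '", pkg, "' is required for this operation", call. = FALSE)
  }
  invisible(TRUE)
}

# Made-up class name; the thread refers to pm_display.
# The namespace check runs before the layout expression is evaluated.
with.demo_display <- function(data, expr, ...) {
  require_pkg("patchwork")
  eval(substitute(expr), unclass(data), enclos = parent.frame())
}

# Numbers stand in for plots. The result is 3 when patchwork is installed,
# otherwise the error message from require_pkg().
d <- structure(list(a = 1, b = 2), class = "demo_display")
res <- tryCatch(with(d, a + b), error = function(e) conditionMessage(e))
res
```

Putting the check in the method rather than in each upstream function means every caller that reaches the layout step is covered by a single guard.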

metrumresearchgroup / pmplots

Diagnostic plot "displays" #77
