cjvanlissa / tidySEM

54 stars 7 forks source link

Bug: `ggplot2` warnings and missing covariance links (edges) in version 0.2.4 #71

Open rempsyc opened 1 year ago

rempsyc commented 1 year ago

In tidySEM’s latest CRAN version (0.2.4), it seems that ggplot2 warnings appeared that were not in the previous version. Additionally, the covariance edges (paths) between textual and speed in the reprex below has disappeared.

New:

suppressWarnings(suppressPackageStartupMessages(library(lavaan)))
suppressWarnings(suppressPackageStartupMessages(library(tidySEM)))

version$version.string
#> [1] "R version 4.2.2 (2022-10-31 ucrt)"
packageVersion("tidySEM")
#> [1] '0.2.4'
packageVersion("ggplot2")
#> [1] '3.4.2'

model <- "
visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
speed ~ ageyr + grade
textual ~ ageyr + grade
x4 ~~ x5 + x6
textual ~~ speed
"

fit <- sem(model, HolzingerSwineford1939)

layout <- data.frame(
  IV = c("x1", "grade", "", "ageyr", ""),
  M = c("x2", "", "visual", "", ""),
  DV = c("x3", "textual", "", "speed", "x9"),
  DV.items = c(paste0("x", 4:8)))

graph_sem(fit, layout = layout)
#> Warning: Removed 101 rows containing missing values (`geom_path()`).
#> Warning: Removed 1 rows containing missing values (`geom_label()`).


sessionInfo()
#> R version 4.2.2 (2022-10-31 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_Canada.utf8  LC_CTYPE=English_Canada.utf8   
#> [3] LC_MONETARY=English_Canada.utf8 LC_NUMERIC=C                   
#> [5] LC_TIME=English_Canada.utf8    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] tidySEM_0.2.4 OpenMx_2.21.8 lavaan_0.6-15
#> 
#> loaded via a namespace (and not attached):
#>   [1] nlme_3.1-162          matrixStats_0.63.0    fs_1.6.2             
#>   [4] httr_1.4.6            rstan_2.26.13         R.cache_0.16.0       
#>   [7] tools_4.2.2           backports_1.4.1       utf8_1.2.3           
#>  [10] R6_2.5.1              colorspace_2.1-0      withr_2.5.0          
#>  [13] tidyselect_1.2.0      gridExtra_2.3         prettyunits_1.1.1    
#>  [16] mnormt_2.1.1          processx_3.8.1        nonnest2_0.5-5       
#>  [19] curl_5.0.0            compiler_4.2.2        progressr_0.13.0     
#>  [22] cli_3.6.1             fastDummies_1.6.3     sandwich_3.0-2       
#>  [25] labeling_0.4.2        scales_1.2.1          checkmate_2.2.0      
#>  [28] psych_2.3.3           mvtnorm_1.1-3         quadprog_1.5-8       
#>  [31] callr_3.7.3           digest_0.6.31         pbivnorm_0.6.0       
#>  [34] StanHeaders_2.26.13   dbscan_1.1-11         rmarkdown_2.21       
#>  [37] R.utils_2.12.2        pkgconfig_2.0.3       htmltools_0.5.5      
#>  [40] parallelly_1.35.0     styler_1.9.1          fastmap_1.1.1        
#>  [43] rlang_1.1.1           rstudioapi_0.14       farver_2.1.1         
#>  [46] generics_0.1.3        zoo_1.8-12            jsonlite_1.8.4       
#>  [49] car_3.1-2             dplyr_1.1.2           R.oo_1.25.0          
#>  [52] inline_0.3.19         magrittr_2.0.3        bayesplot_1.10.0     
#>  [55] loo_2.6.0             texreg_1.38.6         Matrix_1.5-1         
#>  [58] Rcpp_1.0.10           munsell_0.5.0         fansi_1.0.4          
#>  [61] abind_1.4-5           proto_1.0.0           lifecycle_1.0.3      
#>  [64] R.methodsS3_1.8.2     yaml_2.3.7            carData_3.0-5        
#>  [67] CompQuadForm_1.4.3    MASS_7.3-58.3         blavaan_0.4-7        
#>  [70] pkgbuild_1.4.0        plyr_1.8.8            grid_4.2.2           
#>  [73] listenv_0.9.0         parallel_4.2.2        crayon_1.5.2         
#>  [76] lattice_0.20-45       pander_0.6.5          tmvnsim_1.0-2        
#>  [79] knitr_1.42            ps_1.7.5              pillar_1.9.0         
#>  [82] igraph_1.4.2          boot_1.3-28           bain_0.2.8           
#>  [85] future.apply_1.10.0   codetools_0.2-18      stats4_4.2.2         
#>  [88] rstantools_2.3.1      reprex_2.0.2          glue_1.6.2           
#>  [91] evaluate_0.21         V8_4.2.2              data.table_1.14.8    
#>  [94] RcppParallel_5.1.7    vctrs_0.6.2           gtable_0.3.3         
#>  [97] RANN_2.6.1            purrr_1.0.1           future_1.32.0        
#> [100] gsubfn_0.7            ggplot2_3.4.2         xfun_0.39            
#> [103] xtable_1.8-4          coda_0.19-4           tibble_3.2.1         
#> [106] MplusAutomation_1.1.0 globals_0.16.2

Created on 2023-05-11 with reprex v2.0.2

Old

suppressWarnings(suppressPackageStartupMessages(library(lavaan)))
suppressWarnings(suppressPackageStartupMessages(library(tidySEM)))

version$version.string
#> [1] "R version 4.2.2 (2022-10-31 ucrt)"
packageVersion("tidySEM")
#> [1] '0.2.3'
packageVersion("ggplot2")
#> [1] '3.4.2'

model <- "
visual =~ x1 + x2 + x3
textual =~ x4 + x5 + x6
speed =~ x7 + x8 + x9
speed ~ ageyr + grade
textual ~ ageyr + grade
x4 ~~ x5 + x6
textual ~~ speed
"

fit <- sem(model, HolzingerSwineford1939)

layout <- data.frame(
  IV = c("x1", "grade", "", "ageyr", ""),
  M = c("x2", "", "visual", "", ""),
  DV = c("x3", "textual", "", "speed", "x9"),
  DV.items = c(paste0("x", 4:8)))

graph_sem(fit, layout = layout)


sessionInfo()
#> R version 4.2.2 (2022-10-31 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=English_Canada.utf8  LC_CTYPE=English_Canada.utf8   
#> [3] LC_MONETARY=English_Canada.utf8 LC_NUMERIC=C                   
#> [5] LC_TIME=English_Canada.utf8    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] tidySEM_0.2.3 OpenMx_2.21.8 lavaan_0.6-15
#> 
#> loaded via a namespace (and not attached):
#>   [1] nlme_3.1-162          matrixStats_0.63.0    fs_1.6.2             
#>   [4] httr_1.4.6            rstan_2.26.13         R.cache_0.16.0       
#>   [7] tools_4.2.2           backports_1.4.1       utf8_1.2.3           
#>  [10] R6_2.5.1              colorspace_2.1-0      withr_2.5.0          
#>  [13] tidyselect_1.2.0      gridExtra_2.3         prettyunits_1.1.1    
#>  [16] mnormt_2.1.1          processx_3.8.1        nonnest2_0.5-5       
#>  [19] curl_5.0.0            compiler_4.2.2        cli_3.6.1            
#>  [22] fastDummies_1.6.3     sandwich_3.0-2        labeling_0.4.2       
#>  [25] scales_1.2.1          checkmate_2.2.0       psych_2.3.3          
#>  [28] mvtnorm_1.1-3         quadprog_1.5-8        callr_3.7.3          
#>  [31] digest_0.6.31         pbivnorm_0.6.0        StanHeaders_2.26.13  
#>  [34] rmarkdown_2.21        R.utils_2.12.2        pkgconfig_2.0.3      
#>  [37] htmltools_0.5.5       parallelly_1.35.0     styler_1.9.1         
#>  [40] fastmap_1.1.1         rlang_1.1.1           rstudioapi_0.14      
#>  [43] farver_2.1.1          generics_0.1.3        zoo_1.8-12           
#>  [46] jsonlite_1.8.4        dplyr_1.1.2           R.oo_1.25.0          
#>  [49] inline_0.3.19         magrittr_2.0.3        bayesplot_1.10.0     
#>  [52] loo_2.6.0             texreg_1.38.6         Matrix_1.5-1         
#>  [55] Rcpp_1.0.10           munsell_0.5.0         fansi_1.0.4          
#>  [58] proto_1.0.0           lifecycle_1.0.3       R.methodsS3_1.8.2    
#>  [61] yaml_2.3.7            CompQuadForm_1.4.3    MASS_7.3-58.3        
#>  [64] blavaan_0.4-7         pkgbuild_1.4.0        plyr_1.8.8           
#>  [67] grid_4.2.2            parallel_4.2.2        listenv_0.9.0        
#>  [70] crayon_1.5.2          lattice_0.20-45       pander_0.6.5         
#>  [73] tmvnsim_1.0-2         knitr_1.42            ps_1.7.5             
#>  [76] pillar_1.9.0          igraph_1.4.2          boot_1.3-28          
#>  [79] future.apply_1.10.0   codetools_0.2-18      stats4_4.2.2         
#>  [82] rstantools_2.3.1      reprex_2.0.2          glue_1.6.2           
#>  [85] evaluate_0.21         V8_4.2.2              data.table_1.14.8    
#>  [88] RcppParallel_5.1.7    vctrs_0.6.2           gtable_0.3.3         
#>  [91] purrr_1.0.1           future_1.32.0         gsubfn_0.7           
#>  [94] ggplot2_3.4.2         xfun_0.39             xtable_1.8-4         
#>  [97] coda_0.19-4           tibble_3.2.1          MplusAutomation_1.1.0
#> [100] globals_0.16.2

Created on 2023-05-11 with reprex v2.0.2

cjvanlissa commented 1 year ago

Thank you for reporting! I think it was down to the new code for flipping curved edges to avoid crowding, which divided by zero if nodes were exactly aligned. Could you try the update I just pushed?

rempsyc commented 1 year ago

That works very well, thanks so much!

However, it seems that the sign of curvature is now ignored in some contexts. Is it a feature or bug? Reprex below:

New

suppressWarnings(suppressPackageStartupMessages(library(lavaan)))
suppressWarnings(suppressPackageStartupMessages(library(tidySEM)))
suppressWarnings(suppressPackageStartupMessages(library(dplyr)))

packageVersion("tidySEM")
#> [1] '0.2.5.2'

data <- HolzingerSwineford1939
data$visual <- rowMeans(data[paste0("x", 1:3)])
data$textual <- rowMeans(data[paste0("x", 4:6)])
data$speed <- rowMeans(data[paste0("x", 7:9)])

model <- "
textual ~ visual
speed ~ grade
"
fit <- sem(model, data)
layout <- data.frame(
  IV = c("grade", "", ""),
  M = c("", "visual", ""),
  DV = c("textual", "", "speed"))
p <- prepare_graph(fit, layout = layout)

# This works
p$edges <- p$edges %>%
  mutate(curvature = replace(.$curvature, from == "grade" & to == "speed", -40))
plot(p)


# Does this flip the curve?
p$edges <- p$edges %>%
  mutate(curvature = replace(.$curvature, from == "grade" & to == "speed", 40))
plot(p)

Created on 2023-05-12 with reprex v2.0.2

Old

suppressWarnings(suppressPackageStartupMessages(library(lavaan)))
suppressWarnings(suppressPackageStartupMessages(library(tidySEM)))
suppressWarnings(suppressPackageStartupMessages(library(dplyr)))

packageVersion("tidySEM")
#> [1] '0.2.3'

data <- HolzingerSwineford1939
data$visual <- rowMeans(data[paste0("x", 1:3)])
data$textual <- rowMeans(data[paste0("x", 4:6)])
data$speed <- rowMeans(data[paste0("x", 7:9)])

model <- "
textual ~ visual
speed ~ grade
"
fit <- sem(model, data)
layout <- data.frame(
  IV = c("grade", "", ""),
  M = c("", "visual", ""),
  DV = c("textual", "", "speed"))
p <- prepare_graph(fit, layout = layout)

# This works
p$edges <- p$edges %>%
  mutate(curvature = replace(.$curvature, from == "grade" & to == "speed", -40))
plot(p)


# Does this flip the curve?
p$edges <- p$edges %>%
  mutate(curvature = replace(.$curvature, from == "grade" & to == "speed", 40))
plot(p)

Created on 2023-05-12 with reprex v2.0.2

cjvanlissa commented 1 year ago

It's trying to move the curve away from other objects, but I see how this could overwrite user defaults as you illustrate. I have to think more about what the priority should be.. do you have opinions about it?

rempsyc commented 1 year ago

Well, let's think about it this way. If I don't like the old default placement, I can still change it to fit my needs. Whereas, if I don't like the new default placement, is seems like I can't do anything about it because my user choice is being overwritten. So I think user choice should take priority.

In the example above, it is clear that the option that does not overlap is better. But that was a simplified example. In the current model I am working with, which is much larger, I prefer having the curves opposite to the new default.

In the middle of the graph, this is how I want the curved lines (old tidySEM version):

old_tidySEM

Whereas, this is what I am getting with the new version:

new_tidySEM

All in all, I think I also prefer the old line behaviour in general, even though starting the curved edges from the sides takes a bit more space than from top/bottom, but I suppose that's a bit more subjective.

cjvanlissa commented 1 year ago

@rempsyc I think I've fixed this, but would you please have a look?

I need to implement integration tests for the figures :( But how to go about that...

rempsyc commented 1 year ago

Works!

suppressWarnings(suppressPackageStartupMessages(library(lavaan)))
suppressWarnings(suppressPackageStartupMessages(library(tidySEM)))
suppressWarnings(suppressPackageStartupMessages(library(dplyr)))

packageVersion("tidySEM")
#> [1] '0.2.5.3'

data <- HolzingerSwineford1939
data$visual <- rowMeans(data[paste0("x", 1:3)])
data$textual <- rowMeans(data[paste0("x", 4:6)])
data$speed <- rowMeans(data[paste0("x", 7:9)])

model <- "
textual ~ visual
speed ~ grade
"
fit <- sem(model, data)
layout <- data.frame(
  IV = c("grade", "", ""),
  M = c("", "visual", ""),
  DV = c("textual", "", "speed"))
p <- prepare_graph(fit, layout = layout)

# This works
p$edges <- p$edges %>%
  mutate(curvature = replace(.$curvature, from == "grade" & to == "speed", -40))
plot(p)


# Does this flip the curve?
p$edges <- p$edges %>%
  mutate(curvature = replace(.$curvature, from == "grade" & to == "speed", 40))
plot(p)

Created on 2023-06-17 with reprex v2.0.2

And:

new figure lavaanextra

rempsyc commented 1 year ago

I need to implement integration tests for the figures :( But how to go about that...

There is, in fact, a way to integrate tests for figures (though I've never used it myself yet):

what happens if you want to snapshot an image? expect_snapshot_file() provides an alternative workflow that generates one snapshot per expectation, rather than one file per test. Assuming you’re in test-burger.R then the snapshot created by expect_snapshot_file(code_that_returns_path_to_file(), "toppings.png") would be saved in tests/testthat/_snaps/burger/toppings.png. If a future change in the code creates a different file it will be saved in tests/testthat/_snaps/burger/toppings.new.png.

Unlike expect_snapshot() and friends, expect_snapshot_file() can’t provide an automatic diff when the test fails. Instead you’ll need to call snapshot_review(). This launches a Shiny app that allows you to visually review each change and approve it if it’s deliberate.

From: https://testthat.r-lib.org/articles/snapshotting.html#whole-file-snapshotting

cjvanlissa commented 1 year ago

Yes, I've used this once.. the difficult thing is that there are changes we could make to figures that are not breaking but improve functionality, and this still would trigger a breaking test.

Maybe instead I should make a gallery of figures and compare them manually..

rempsyc commented 1 year ago

At easystats we rely on snapshots a lot. And we often make minor non-breaking changes and that triggers the snapshots tests, but we simply review them and accept the changes (when things work as expected). It is part of our workflow now. I think it is not much different for image snapshots. And it is very similar to comparing them manually for similarity like you suggest except you only have to review the differences when there has been a chance instead of all the time.