willwerscheid / flashier

A faster and angrier package for EBMF.
https://willwerscheid.github.io/flashier/

flashier:::plot.flash reorders the loadings #133

Open parsifal9 opened 2 months ago

parsifal9 commented 2 months ago

Hi willwerscheid,

The function flashier:::plot.flash reorders the loadings for some of my examples, and I am not sure why. It does not reorder them for the example in the vignette.

Extracting the "L" and "F" matrices shows no reordering; it is only apparent in the plot. Going through the code, I think the reordering happens in the ggplot functions. However, adding the last line of this snippet to flashier:::plot.flash fixes the problem for me:

val <- val[, kset, drop = FALSE]
pm.df <- data.frame(val = val) %>% tibble::rownames_to_column(var = "Name")
pm.df <- pm.df %>% tidyr::pivot_longer(-Name, names_to = "k", values_to = "val") %>% dplyr::mutate(k = as.numeric(str_remove(k, "val.")))
pm.df <- pm.df %>% left_join(pve.df, by = "k")
# Added line: set the levels of Name in order of first appearance
pm.df$Name <- factor(pm.df$Name, levels = unique(pm.df$Name))

Bye

pcarbo commented 2 months ago

@parsifal9 Are you using the most recent version of flashier that is available on GitHub?

parsifal9 commented 2 months ago

Hi @pcarbo ,

I was using the CRAN version 1.0.7, but I just installed 1.0.53 and the same problem appears.

> sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS:   /usr/local/R/R-4.3.2/lib/R/lib/libRblas.so 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C               LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8     LC_MONETARY=en_AU.UTF-8   
 [6] LC_MESSAGES=en_AU.UTF-8    LC_PAPER=en_AU.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

time zone: Australia/Sydney
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] cowplot_1.1.3     viridis_0.6.5     viridisLite_0.4.2 lubridate_1.9.3   forcats_1.0.0     stringr_1.5.1     dplyr_1.1.4       purrr_1.0.2      
 [9] readr_2.1.5       tidyr_1.3.1       tibble_3.2.1      tidyverse_2.0.0   ggplot2_3.5.1     flashier_1.0.53   ebnm_1.1-34      

loaded via a namespace (and not attached):
 [1] softImpute_1.4-1     gtable_0.3.5         htmlwidgets_1.6.4    ggrepel_0.9.6        lattice_0.21-9       tzdb_0.4.0           quadprog_1.5-8      
 [8] vctrs_0.6.5          tools_4.3.2          generics_0.1.3       parallel_4.3.2       Polychrome_1.5.1     fansi_1.0.6          pkgconfig_2.0.3     
[15] Matrix_1.6-5         data.table_1.16.0    SQUAREM_2021.1       RColorBrewer_1.1-3   RcppParallel_5.1.9   scatterplot3d_0.3-44 lifecycle_1.0.4     
[22] truncnorm_1.0-9      farver_2.1.2         compiler_4.3.2       progress_1.2.3       munsell_0.5.1        RhpcBLASctl_0.23-42  htmltools_0.5.8.1   
[29] lazyeval_0.2.2       plotly_4.10.4        pillar_1.9.0         crayon_1.5.3         uwot_0.2.2           trust_0.1-8          gtools_3.9.5        
[36] tidyselect_1.2.1     digest_0.6.37        stringi_1.8.4        Rtsne_0.17           ashr_2.2-63          labeling_0.4.3       splines_4.3.2       
[43] fastmap_1.2.0        grid_4.3.2           colorspace_2.1-1     cli_3.6.3            invgamma_1.1         magrittr_2.0.3       utf8_1.2.4          
[50] withr_3.0.1          prettyunits_1.2.0    scales_1.3.0         timechange_0.3.0     horseshoe_0.2.0      httr_1.4.7           fastTopics_0.6-192  
[57] gridExtra_2.3        deconvolveR_1.2-1    hms_1.1.3            pbapply_1.7-2        irlba_2.3.5.1        rlang_1.1.4          Rcpp_1.0.13         
[64] mixsqp_0.3-54        glue_1.7.0           jsonlite_1.8.8       R6_2.5.1            
>

pcarbo commented 2 months ago

Okay thanks @parsifal9. Yes, it is possible there is a bug in our ggplot2 code. If you can possibly share an example that reproduces the error you get, that would be helpful for us to pinpoint the bug and fix it.

parsifal9 commented 2 months ago

Hi @pcarbo,

In plot.flash, pm.df looks like this:

pm.df[1:10,1:5]
# A tibble: 10 × 5
   Name                           k     val     pve k.order
   <chr>                      <dbl>   <dbl>   <dbl>   <dbl>
 1 ENSMPUG00000000004_Unknown     1 -0.358  0.827         1
 2 ENSMPUG00000000004_Unknown     2  0.0127 0.0338        2
 3 ENSMPUG00000000004_Unknown     3  0.218  0.0197        3

The Name column is a character vector. In the line aes(x = Name, y = val, fill = Name), ggplot2 treats Name as a discrete variable and orders its values alphabetically, as if it were a factor with default levels. This can result in the bars being plotted in a different order from the rows of the data.

I cannot share that data, as it comes from someone's experiment. But here is a small example:
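The ordering behaviour described here can be seen in base R alone, without any plotting. The following is a minimal sketch; the labels in `name` are hypothetical stand-ins for row names, not values from the reporter's data.

```r
# Hypothetical labels, in the order they would appear as row names.
name <- c("B", "A", "C", "B", "A", "C")

# Default conversion to a factor sorts the unique values.
default_levels <- levels(factor(name))

# Taking the levels from unique() keeps the order of first appearance.
ordered_levels <- levels(factor(name, levels = unique(name)))

print(default_levels)  # "A" "B" "C"
print(ordered_levels)  # "B" "A" "C"
```

The second form is the same idiom as the line proposed for plot.flash: the data are unchanged, only the level order attached to them differs.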

example_df <- data.frame(
  Name = c("B", "A", "C", "B", "A", "C"),
  val = c(1, 2, 3, 4, 5, 6),
  k.order = c(1, 1, 1, 2, 2, 2)
)
print(example_df)

And a plot in which the bars are reordered:

library(ggplot2)
p1 <- ggplot(example_df, aes(x = Name, y = val, fill = Name)) + 
  geom_col() + 
  facet_wrap(~k.order) + 
  theme_minimal() + 
  labs(title = "Without Fix", x = "", y = "")
print(p1)

However, by setting Name as a factor with its levels in order of first appearance, we keep the original order:

example_df$Name <- factor(example_df$Name, levels = unique(example_df$Name))

p2 <- ggplot(example_df, aes(x = Name, y = val, fill = Name)) + 
  geom_col() + 
  facet_wrap(~k.order) + 
  theme_minimal() + 
  labs(title = "With Fix", x = "", y = "")
print(p2)
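
The difference between the two plots above can also be checked without looking at the figures, by asking ggplot2 for the categories on the x axis. This is a sketch under assumptions: `make_plot` and `x_order` are hypothetical helpers written for this illustration, not part of flashier, and the check relies on `layer_scales()` returning the trained x scale of the first panel.

```r
library(ggplot2)

# Toy data mirroring the example above.
df_chr <- data.frame(
  Name = c("B", "A", "C", "B", "A", "C"),
  val = c(1, 2, 3, 4, 5, 6),
  k.order = c(1, 1, 1, 2, 2, 2)
)

# Same data, with Name as a factor whose levels follow first appearance.
df_fct <- df_chr
df_fct$Name <- factor(df_fct$Name, levels = unique(df_fct$Name))

# Hypothetical helper: build the same faceted bar plot for either data frame.
make_plot <- function(df) {
  ggplot(df, aes(x = Name, y = val, fill = Name)) +
    geom_col() +
    facet_wrap(~k.order)
}

# Hypothetical helper: x-axis categories of the first panel, in plotting order.
x_order <- function(p) layer_scales(p)$x$get_limits()

order_chr <- x_order(make_plot(df_chr))
order_fct <- x_order(make_plot(df_fct))
print(order_chr)
print(order_fct)
```

If the diagnosis in this thread is right, `order_chr` should come back sorted alphabetically and `order_fct` should follow the row order of the data. A check like this could be turned into a regression test once the fix is in place.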
pcarbo commented 2 months ago

Thank you @parsifal9, I agree it appears to be a bug. I will use your example to troubleshoot and correct the error. I may not get to it immediately, but I will leave this Issue open as a reminder.