pcastellanoescuder / POMA

:package: Omics Data Analysis Tools
GNU General Public License v3.0
11 stars 5 forks source link

Inconsistencies between PomaLimma() function and PomaVolcano results (Bioconductor version 1.12) #13

Closed pbaCamille closed 5 months ago

pbaCamille commented 6 months ago

Hello,

I am using the 1.12 version from Bioconductor. I found some inconsistencies between PomaLimma() function and PomaVolcano results. I think the volcano plot does not use the right feature names.

imputed <- PomaImpute(st000336, ZerosAsNA = TRUE, RemoveNA = TRUE, cutoff = 20, method = "knn")
normalized <- PomaNorm(imputed, method = "log_pareto")
pre_processed <- PomaOutliers(normalized, do = "clean")
limma = PomaLimma(pre_processed, contrast = "Controls-DMD", adjust = "fdr")
limma

# A tibble: 30 × 7
   feature             logFC AveExpr     t       P.Value    adj.P.Val     B
   <chr>               <dbl>   <dbl> <dbl>         <dbl>        <dbl> <dbl>
 1 tryptophan         -0.775 0.00862 -7.01 0.00000000269 0.0000000807 11.0 
 2 valine             -0.701 0.0127  -6.63 0.0000000116  0.000000173   9.55
 3 ornithine          -0.633 0.0337  -6.26 0.0000000479  0.000000479   8.16
 4 isoleucine         -0.606 0.00438 -5.95 0.000000159   0.00000119    6.99
 5 lactate            -0.785 0.0184  -5.69 0.000000430   0.00000258    6.02
 6 pyruvate           -0.624 0.0121  -5.43 0.00000112    0.00000514    5.09
 7 leucine            -0.616 0.0153  -5.41 0.00000120    0.00000514    5.02
 8 methionine         -0.552 0.0197  -4.91 0.00000764    0.0000287     3.22
 9 serine             -0.536 0.0448  -4.76 0.0000132     0.0000440     2.69
10 x1_methylhistidine -0.563 0.0373  -4.73 0.0000147     0.0000440     2.59
# ℹ 20 more rows
# ℹ Use `print(n = ...)` to see more rows

volcano = PomaVolcano(pre_processed, method = "limma", labels = TRUE)
volcano[["data"]]

         pvalue      logFC              names      threshold
1  2.691438e-09 -0.7749207 x1_methylhistidine Down-regulated
2  1.155051e-08 -0.7009883 x3_methylhistidine Down-regulated
3  4.794292e-08 -0.6327809            alanine Down-regulated
4  1.591994e-07 -0.6058485           arginine Down-regulated
5  4.296261e-07 -0.7853613         asparagine Down-regulated
6  1.117445e-06 -0.6244615      aspartic_acid Down-regulated
7  1.198180e-06 -0.6155105      glutamic_acid Down-regulated
8  7.642323e-06 -0.5517786          glutamine           <NA>
9  1.321166e-05 -0.5358998            glycine           <NA>
10 1.467187e-05 -0.5629464          histidine           <NA>

As you can see the names used in the volcano plot are not consistent with the feature names obtained with PomaLimma().

sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS/LAPACK: /home/trottier-c/miniconda3/envs/poma/lib/libopenblasp-r0.3.26.so;  LAPACK version 3.12.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C               LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8    LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Paris
tzcode source: system (glibc)

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] SummarizedExperiment_1.32.0 Biobase_2.62.0              GenomicRanges_1.54.1        GenomeInfoDb_1.38.8        
 [5] IRanges_2.36.0              S4Vectors_0.40.2            BiocGenerics_0.48.1         MatrixGenerics_1.14.0      
 [9] matrixStats_1.2.0           data.table_1.15.4           plotly_4.10.4               ggraph_2.2.1               
[13] ggplot2_3.5.0               POMA_1.12.0                

loaded via a namespace (and not attached):
 [1] tidyselect_1.2.1        viridisLite_0.4.2       dplyr_1.1.4             farver_2.1.1            viridis_0.6.5          
 [6] bitops_1.0-7            fastmap_1.1.1           lazyeval_0.2.2          RCurl_1.98-1.14         tweenr_2.0.3           
[11] digest_0.6.35           lifecycle_1.0.4         cluster_2.1.6           statmod_1.5.0           magrittr_2.0.3         
[16] compiler_4.3.3          rlang_1.1.3             tools_4.3.3             igraph_2.0.3            utf8_1.2.4             
[21] yaml_2.3.8              knitr_1.46              labeling_0.4.3          S4Arrays_1.2.1          graphlayouts_1.1.1     
[26] htmlwidgets_1.6.4       DelayedArray_0.28.0     abind_1.4-5             withr_3.0.0             purrr_1.0.2            
[31] grid_4.3.3              polyclip_1.10-6         fansi_1.0.6             colorspace_2.1-0        scales_1.3.0           
[36] MASS_7.3-60.0.1         cli_3.6.2               rmarkdown_2.26          vegan_2.6-4             crayon_1.5.2           
[41] generics_0.1.3          rstudioapi_0.16.0       httr_1.4.7              cachem_1.0.8            ggforce_0.4.2          
[46] zlibbioc_1.48.2         splines_4.3.3           parallel_4.3.3          impute_1.76.0           XVector_0.42.0         
[51] vctrs_0.6.5             Matrix_1.6-5            jsonlite_1.8.8          ggrepel_0.9.5           limma_3.58.1           
[56] tidyr_1.3.1             glue_1.7.0              gtable_0.3.4            munsell_0.5.1           tibble_3.2.1           
[61] pillar_1.9.0            htmltools_0.5.8.1       GenomeInfoDbData_1.2.11 R6_2.5.1                tidygraph_1.3.1        
[66] evaluate_0.23           lattice_0.22-6          memoise_2.0.1           Rcpp_1.0.12             nlme_3.1-164           
[71] gridExtra_2.3           SparseArray_1.2.4       permute_0.9-7           mgcv_1.9-1              xfun_0.43              
[76] pkgconfig_2.0.3       
pcastellanoescuder commented 5 months ago

Hey @pbaCamille ! I'd suggest to use the latest POMA version from GitHub, 1.13.26. Install via:

devtools::install_github("pcastellanoescuder/POMA")

And then try to re-run your code:

library(POMA)

imputed <- PomaImpute(st000336, zeros_as_na = TRUE, remove_na = TRUE, cutoff = 20, method = "knn")
normalized <- PomaNorm(imputed, method = "log_pareto")
pre_processed <- PomaOutliers(normalized)
limma <- PomaLimma(pre_processed$data, contrast = "Controls-DMD", adjust = "fdr")

limma %>% 
  dplyr::select(feature, logFC, adj_pvalue) %>% 
  PomaVolcano(labels = TRUE)
Screenshot 2024-04-14 at 9 39 48 AM