drostlab / myTAI

Evolutionary Transcriptomics with R
https://drostlab.github.io/myTAI/
GNU General Public License v2.0

`CollapseReplicates()` always returns `Phylostratum` as the first column. #36

Closed: LotharukpongJS closed this issue 1 year ago

LotharukpongJS commented 1 year ago

Describe the bug: `CollapseReplicates()` names the first column of its output `Phylostratum`, even when the first column of the input is `Divergence.stratum`.

To Reproduce

> data("DivergenceExpressionSetExample")
> CollapseReplicates(ExpressionSet = DivergenceExpressionSetExample[1:5,1:9], 
+                    nrep          = c(2,2,3), 
+                    FUN           = mean, 
+                    stage.names   = c("S1","S2","S3"))
# A tibble: 5 × 5
  Phylostratum GeneID         S1    S2    S3
         <int> <fct>       <dbl> <dbl> <dbl>
1            1 at1g01050.1 1659. 1615. 1228.
2            1 at1g01120.1  816.  896.  869.
3            1 at1g01140.3  975. 1018.  997.
4            1 at1g01170.1 1202. 1219. 4824.
5            1 at1g01230.1  920.  949.  836.

Yet the first column of the input `DivergenceExpressionSetExample` is named `Divergence.stratum`:

> head(DivergenceExpressionSetExample)
  Divergence.stratum      GeneID    Zygote  Quadrant  Globular     Heart   Torpedo      Bent     Mature
1                  1 at1g01050.1 1501.0141 1817.3086 1665.3089 1564.7612 1496.3207 1114.6435  1071.6555
2                  1 at1g01120.1  844.0414  787.5929  859.6267  931.6180  942.8453  870.2625   792.7542
3                  1 at1g01140.3 1041.4291  908.3929 1068.8832  967.7490 1055.1901 1109.4662   825.4633
4                  1 at1g01170.1 1361.6646 1042.1991 1225.5625 1211.7386 1674.5224 2136.4284 10662.4763
5                  1 at1g01230.1  894.1276  946.6993  933.0931  965.1859  870.9218  843.1814   794.6536
6                  1 at1g01540.2 1464.3065 1451.4255 2378.7054 1993.9326 1800.2420 2119.9220  1020.2640

Expected behaviour: For any dataset that satisfies `myTAI::is.ExpressionSet()`, the output should keep the name of the input's first column rather than replacing it with `Phylostratum`.

Session info:

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] DESeq2_1.38.3 SummarizedExperiment_1.28.0 Biobase_2.58.0 MatrixGenerics_1.10.0
[5] matrixStats_1.0.0 GenomicRanges_1.50.2 GenomeInfoDb_1.34.9 IRanges_2.32.0
[9] S4Vectors_0.36.2 BiocGenerics_0.44.0 myTAI_1.0.1.9000 lubridate_1.9.2
[13] forcats_1.0.0 stringr_1.5.0 dplyr_1.1.2 purrr_1.0.2
[17] readr_2.1.4 tidyr_1.3.0 tibble_3.2.1 ggplot2_3.4.2
[21] tidyverse_2.0.0

loaded via a namespace (and not attached):
[1] colorspace_2.1-0 ellipsis_0.3.2 rprojroot_2.0.3 XVector_0.38.0 fs_1.6.3
[6] rstudioapi_0.15.0 farver_2.1.1 remotes_2.4.2.1 bit64_4.0.5 AnnotationDbi_1.60.2
[11] fansi_1.0.4 codetools_0.2-19 splines_4.2.2 cachem_1.0.8 geneplotter_1.76.0
[16] knitr_1.43 pkgload_1.3.2.1 annotate_1.76.0 png_0.1-8 shiny_1.7.4.1
[21] compiler_4.2.2 httr_1.4.6 Matrix_1.5-4.1 fastmap_1.1.1 cli_3.6.1
[26] later_1.3.1 htmltools_0.5.5 prettyunits_1.1.1 tools_4.2.2 gtable_0.3.3
[31] glue_1.6.2 GenomeInfoDbData_1.2.9 Rcpp_1.0.11 Biostrings_2.66.0 vctrs_0.6.3
[36] iterators_1.0.14 xfun_0.40 ps_1.7.5 timechange_0.2.0 mime_0.12
[41] miniUI_0.1.1.1 lifecycle_1.0.3 devtools_2.4.5 XML_3.99-0.14 MASS_7.3-60
[46] zlibbioc_1.44.0 scales_1.2.1 vroom_1.6.3 hms_1.1.3 promises_1.2.1
[51] parallel_4.2.2 RColorBrewer_1.1-3 yaml_2.3.7 curl_5.0.1 memoise_2.0.1
[56] see_0.8.0 stringi_1.7.12 RSQLite_2.3.1 desc_1.4.2 foreach_1.5.2
[61] pkgbuild_1.4.2 BiocParallel_1.32.6 rlang_1.1.1 pkgconfig_2.0.3 bitops_1.0-7
[66] evaluate_0.21 lattice_0.21-8 htmlwidgets_1.6.2 labeling_0.4.2 bit_4.0.5
[71] processx_3.8.2 tidyselect_1.2.0 ggsci_3.0.0 magrittr_2.0.3 R6_2.5.1
[76] generics_0.1.3 profvis_0.3.8 DelayedArray_0.24.0 DBI_1.1.3 pillar_1.9.0
[81] withr_2.5.0 fitdistrplus_1.1-11 survival_3.5-5 KEGGREST_1.38.0 RCurl_1.98-1.12
[86] crayon_1.5.2 utf8_1.2.3 tzdb_0.4.0 rmarkdown_2.23 urlchecker_1.0.1
[91] usethis_2.2.2 locfit_1.5-9.8 grid_4.2.2 blob_1.2.4 callr_3.7.3
[96] digest_0.6.33 xtable_1.8-4 httpuv_1.6.11 munsell_0.5.0 sessioninfo_1.2.2

HajkD commented 1 year ago

Hi @LotharukpongJS ,

Thank you very much for catching this!

I have now fixed the issue. The name of the first column was hard-coded as `Phylostratum`; it is now taken from the first column name of the user's input dataset.

The fix now works as intended:

CollapseReplicates(ExpressionSet = DivergenceExpressionSetExample[1:5,1:9],
                   nrep          = c(2,2,3),
                   FUN           = mean,
                   stage.names   = c("S1","S2","S3"))
# A tibble: 5 × 5
  Divergence.stratum GeneID         S1    S2    S3
               <int> <fct>       <dbl> <dbl> <dbl>
1                  1 at1g01050.1 1659. 1615. 1228.
2                  1 at1g01120.1  816.  896.  869.
3                  1 at1g01140.3  975. 1018.  997.
4                  1 at1g01170.1 1202. 1219. 4824.
5                  1 at1g01230.1  920.  949.  836.
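
For readers who want to see what a generic renaming of this kind can look like, here is a minimal base-R sketch. It is not the myTAI source code: `collapse_sketch()` and the `toy` data frame are hypothetical names introduced only for illustration (the toy values are the first two rows of the `head()` output quoted in the report), and the sketch assumes the usual ExpressionSet layout of a stratum column, a gene ID column, and then the expression columns.

```r
# Hypothetical helper (NOT the myTAI implementation): collapse replicate
# columns per stage and reuse the input's own names for the first two columns.
collapse_sketch <- function(ExpressionSet, nrep, FUN, stage.names) {
  # Assumption: columns 1 and 2 are stratum and gene ID, the rest is expression.
  expr <- as.matrix(ExpressionSet[, -(1:2), drop = FALSE])
  stopifnot(sum(nrep) == ncol(expr), length(nrep) == length(stage.names))

  # Map each replicate column to its stage index, e.g. 1 1 2 2 3 3 3.
  stage_of_column <- rep(seq_along(nrep), times = nrep)

  # Apply FUN row-wise within each stage's block of replicate columns.
  collapsed <- sapply(seq_along(nrep), function(i) {
    apply(expr[, stage_of_column == i, drop = FALSE], 1, FUN)
  })
  collapsed <- matrix(collapsed, nrow = nrow(expr))
  colnames(collapsed) <- stage.names

  out <- data.frame(ExpressionSet[, 1:2, drop = FALSE], collapsed,
                    check.names = FALSE)
  # Take the names from the input instead of hard-coding "Phylostratum".
  names(out)[1:2] <- names(ExpressionSet)[1:2]
  out
}

# Toy input: the first two rows of the head() output quoted in the report.
toy <- data.frame(
  Divergence.stratum = c(1L, 1L),
  GeneID   = c("at1g01050.1", "at1g01120.1"),
  Zygote   = c(1501.0141, 844.0414),
  Quadrant = c(1817.3086, 787.5929),
  Globular = c(1665.3089, 859.6267),
  Heart    = c(1564.7612, 931.6180),
  Torpedo  = c(1496.3207, 942.8453),
  Bent     = c(1114.6435, 870.2625),
  Mature   = c(1071.6555, 792.7542)
)

res <- collapse_sketch(toy, nrep = c(2, 2, 3), FUN = mean,
                       stage.names = c("S1", "S2", "S3"))
names(res)
```

Because the names are read from the input, the same function returns `Phylostratum` for a phylostratigraphic dataset and `Divergence.stratum` for a divergence-stratum dataset, without any special casing.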

With many thanks and very best wishes, Hajk

LotharukpongJS commented 1 year ago

Hi @HajkD,

Many thanks for the fix!

Best wishes, Sodai