waldronlab / bugsigdbr

R-side access to published microbial signatures from BugSigDB
https://bioconductor.org/packages/bugsigdbr
GNU General Public License v3.0

[BUG] Study 722 is missing k__bacteria from MetaPhlAn taxon names #43

Closed cmirzayi closed 1 year ago

cmirzayi commented 1 year ago

Study 722 is missing the kingdom for all of its curated taxa. This creates a downstream issue where getSignatures() fails, because taxa that should be "k__bacteria" are "" instead. getSignatures() should probably be updated to remove blank taxa as a safeguard, but the curated taxon should not be missing in the first place, which points to a bug somewhere upstream.
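
As a rough illustration of the safeguard suggested above, blank taxa could be dropped from each signature before further processing. This is only a sketch on toy data: `drop_blank_taxa` is a hypothetical helper, not part of bugsigdbr, and it assumes signatures are held as a named list of character vectors.

```r
# Toy signatures (hypothetical data, not taken from BugSigDB).
sigs <- list(
  sig1 = c("k__Bacteria|p__Firmicutes", "", "k__Bacteria|p__Bacteroidetes"),
  sig2 = c("", NA_character_),
  sig3 = c("k__Bacteria|p__Proteobacteria")
)

# Hypothetical helper: drop blank or missing taxon names from each signature,
# then drop any signature that ends up with no taxa at all.
drop_blank_taxa <- function(sigs) {
  cleaned <- lapply(sigs, function(taxa) {
    taxa[!is.na(taxa) & nzchar(trimws(taxa))]
  })
  cleaned[lengths(cleaned) > 0]
}

cleaned <- drop_blank_taxa(sigs)
names(cleaned)
```

Whether empty signatures should be dropped or kept as zero-length entries is a design choice; dropping them here is just one option.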

Context

As described and discussed in https://github.com/waldronlab/bugsigdbr/issues/42#issuecomment-1555213510.

Small reproducible example

library(bugsigdbr)
x <- importBugSigDB(version = 'devel', cache = FALSE)
x[x$Study=="Study 722",]$`MetaPhlAn taxon names`
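
To spot affected signatures without reading the printed output by eye, blank entries can be counted per signature. The sketch below uses toy data in place of the real column and assumes that `MetaPhlAn taxon names` is a list of character vectors; the values shown are made up for illustration.

```r
# Toy stand-in for the `MetaPhlAn taxon names` list column (hypothetical data).
taxon_names <- list(
  c("k__Bacteria|p__Firmicutes", "k__Bacteria|p__Bacteroidetes"),
  c("", "k__Bacteria|p__Proteobacteria"),
  c("", "")
)

# Count blank or missing entries in each signature.
n_blank <- vapply(
  taxon_names,
  function(taxa) sum(is.na(taxa) | !nzchar(taxa)),
  integer(1)
)

# Positions of signatures containing at least one blank taxon.
which(n_blank > 0)
```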

R session information

> sessioninfo::session_info()
─ Session info ──────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.2.1 (2022-06-23 ucrt)
 os       Windows 10 x64 (build 19045)
 system   x86_64, mingw32
 ui       RStudio
 language (EN)
 collate  English_United States.utf8
 ctype    English_United States.utf8
 tz       America/New_York
 date     2023-05-19
 rstudio  2023.03.1+446 Cherry Blossom (desktop)
 pandoc   NA

─ Packages ──────────────────────────────────────────────────────────────────────────────────────────────────────────
 package       * version date (UTC) lib source
 BiocFileCache   2.4.0   2022-04-26 [1] Bioconductor
 bit             4.0.5   2022-11-15 [1] CRAN (R 4.2.3)
 bit64           4.0.5   2020-08-30 [1] CRAN (R 4.2.1)
 blob            1.2.4   2023-03-17 [1] CRAN (R 4.2.3)
 bugsigdbr     * 1.3.1   2022-10-05 [1] Github (waldronlab/bugsigdbr@2d4b3f9)
 cachem          1.0.8   2023-05-01 [1] CRAN (R 4.2.3)
 cli             3.4.1   2022-09-23 [1] CRAN (R 4.2.1)
 crayon          1.5.2   2022-09-29 [1] CRAN (R 4.2.1)
 curl            5.0.0   2023-01-12 [1] CRAN (R 4.2.2)
 DBI             1.1.3   2022-06-18 [1] CRAN (R 4.2.1)
 dbplyr          2.3.2   2023-03-21 [1] CRAN (R 4.2.3)
 dplyr         * 1.1.2   2023-04-20 [1] CRAN (R 4.2.3)
 fansi           1.0.4   2023-01-22 [1] CRAN (R 4.2.3)
 fastmap         1.1.1   2023-02-24 [1] CRAN (R 4.2.3)
 filelock        1.0.2   2018-10-05 [1] CRAN (R 4.2.1)
 generics        0.1.3   2022-07-05 [1] CRAN (R 4.2.1)
 glue            1.6.2   2022-02-24 [1] CRAN (R 4.2.1)
 httr            1.4.6   2023-05-08 [1] CRAN (R 4.2.3)
 lifecycle       1.0.3   2022-10-07 [1] CRAN (R 4.2.2)
 magrittr        2.0.3   2022-03-30 [1] CRAN (R 4.2.1)
 memoise         2.0.1   2021-11-26 [1] CRAN (R 4.2.1)
 pillar          1.9.0   2023-03-22 [1] CRAN (R 4.2.3)
 pkgconfig       2.0.3   2019-09-22 [1] CRAN (R 4.2.1)
 purrr           1.0.1   2023-01-10 [1] CRAN (R 4.2.3)
 R6              2.5.1   2021-08-19 [1] CRAN (R 4.2.1)
 rappdirs        0.3.3   2021-01-31 [1] CRAN (R 4.2.1)
 Rcpp            1.0.10  2023-01-22 [1] CRAN (R 4.2.3)
 rlang           1.1.1   2023-04-28 [1] CRAN (R 4.2.3)
 RSQLite         2.3.1   2023-04-03 [1] CRAN (R 4.2.3)
 rstudioapi      0.14    2022-08-22 [1] CRAN (R 4.2.1)
 sessioninfo     1.2.2   2021-12-06 [1] CRAN (R 4.2.2)
 tibble          3.2.1   2023-03-20 [1] CRAN (R 4.2.3)
 tidyselect      1.2.0   2022-10-10 [1] CRAN (R 4.2.2)
 tzdb            0.4.0   2023-05-12 [1] CRAN (R 4.2.3)
 utf8            1.2.3   2023-01-31 [1] CRAN (R 4.2.3)
 vctrs           0.6.2   2023-04-19 [1] CRAN (R 4.2.3)
 vroom           1.6.3   2023-04-28 [1] CRAN (R 4.2.3)
 withr           2.5.0   2022-03-03 [1] CRAN (R 4.2.1)

 [1] C:/Users/cmirz/AppData/Local/R/win-library/4.2
 [2] C:/Program Files/R/R-4.2.1/library

lgeistlinger commented 1 year ago

Thanks @cmirzayi. I've added a fix to bugsigdbr as part of #42 and reported the issue to waldronlab/BugSigDB (https://github.com/waldronlab/BugSigDB/issues/180).