YuLab-SMU / MicrobiotaProcess

:microbe: A comprehensive R package for deep mining microbiome
https://www.sciencedirect.com/science/article/pii/S2666675823000164
178 stars 37 forks source link

ggdiffbox #43

Open MonaLiu421 opened 2 years ago

MonaLiu421 commented 2 years ago

The package is great, thanks for developing this package, I would like to know if each classification shown here represents an ASV or a taxonomy level such as species and genus, etc. image

xiangpin commented 2 years ago

The s__Bacteroides_fragilis represents the species taxonomy level. As you know, each ASV might be annotated to the corresponding taxonomy, which have different resolutions classification. All classification will be tested in default. If you want to test and extract the significant differential ASV or other specific levels. You can refer to the following codes.

> library(MicrobiotaProcess)
MicrobiotaProcess v1.7.8 For help:
https://github.com/YuLab-SMU/MicrobiotaProcess/issues

If you use MicrobiotaProcess in published research, please cite the
paper:

S Xu, L Zhan, W Tang, Z Dai, L Zhou, T Feng, M Chen, S Liu, X Fu, T Wu,
E Hu, G Yu. MicrobiotaProcess: A comprehensive R package for managing
and analyzing microbiome and other ecological data within the tidy
framework. 04 February 2022, PREPRINT (Version 1) available at Research
Square [https://doi.org/10.21203/rs.3.rs-1284357/v1]

This message can be suppressed by:
suppressPackageStartupMessages(library(MicrobiotaProcess))
> data(mouse.time.mpse)
> mouse.time.mpse
# A MPSE-tibble (MPSE object) abstraction: 4,142 × 11
# OTU=218 | Samples=19 | Assays=Abundance | Taxonomy=Kingdom, Phylum, Class, Order, Family, Genus, Species
   OTU    Sample Abundance time  Kingdom Phylum Class Order Family Genus Species
   <chr>  <chr>      <int> <chr> <chr>   <chr>  <chr> <chr> <chr>  <chr> <chr>
 1 OTU_1  F3D0         579 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
 2 OTU_2  F3D0         345 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
 3 OTU_3  F3D0         449 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
 4 OTU_4  F3D0         430 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
 5 OTU_5  F3D0         154 Early k__Bac… p__Ba… c__B… o__B… f__Ba… g__B… s__un_…
 6 OTU_6  F3D0         470 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
 7 OTU_7  F3D0         282 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
 8 OTU_8  F3D0         184 Early k__Bac… p__Ba… c__B… o__B… f__Ri… g__A… s__un_…
 9 OTU_9  F3D0          45 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
10 OTU_10 F3D0         158 Early k__Bac… p__Ba… c__B… o__B… f__Mu… g__u… s__un_…
# … with 4,132 more rows
> mouse.time.mpse %>% mp_rrarefy() %>% mp_diff_analysis(.abundance=RareAbundance, .group=time) %>% mp_extract_taxatree %>% dplyr::filter(!is.na(LDAmean) & nodeClass=="OTU", keep.td=F)
The otutree is empty in the MPSE object!
# A tibble: 51 × 12
    node label   isTip nodeClass nodeDepth RareAbundanceBySamp… LDAupper LDAmean
   <int> <chr>   <lgl> <chr>         <dbl> <list>                  <dbl>   <dbl>
 1     1 OTU_67  TRUE  OTU               8 <tibble [19 × 4]>        3.39    3.36
 2    10 OTU_4   TRUE  OTU               8 <tibble [19 × 4]>        4.40    4.38
 3    11 OTU_6   TRUE  OTU               8 <tibble [19 × 4]>        4.53    4.52
 4    12 OTU_7   TRUE  OTU               8 <tibble [19 × 4]>        4.17    4.15
 5    15 OTU_12  TRUE  OTU               8 <tibble [19 × 4]>        4.22    4.20
 6    16 OTU_14  TRUE  OTU               8 <tibble [19 × 4]>        3.62    3.59
 7    22 OTU_8   TRUE  OTU               8 <tibble [19 × 4]>        4.09    4.07
 8    27 OTU_11  TRUE  OTU               8 <tibble [19 × 4]>        4.20    4.18
 9    37 OTU_78  TRUE  OTU               8 <tibble [19 × 4]>        3.28    3.24
10    38 OTU_113 TRUE  OTU               8 <tibble [19 × 4]>        3.38    3.31
# … with 41 more rows, and 4 more variables: LDAlower <dbl>, Sign_time <chr>,
#   pvalue <dbl>, fdr <dbl>
MonaLiu421 commented 2 years ago

Thanks for the reply, but I still don't quite understand, I want to know if this s__Bacteroides_fragilis represents an ASV only annotated to this species or represent Bacteroides_fragilis species, possibly including multiple ASVs?

xiangpin commented 2 years ago

represent Bacteroides_fragilis species, unless only an ASV was annotated to s__Bacteroides_fragilis. I think you can use your_mpse_object %>% mp_extract_taxonomy %>% filter(Species=="s__Bacteroides_fragilis") to check whether the ASV annotated to s__Bacteroides_fragilis is only one.

MonaLiu421 commented 2 years ago

thank you. maybe my version of the package is old, I cant use the mpse function. What version of R program(my is 4.0.1) is the new version of the MicrobiotaProcess package based on? the installation seems difficult.

xiangpin commented 2 years ago

R >= 4.1, Or how about remotes::install_github("YuLab-SMU/MicrobiotaProcess") ?

MonaLiu421 commented 2 years ago

After updating r, it has been installed. The usage of the new version is a bit unfamiliar. I would like to ask if there is any explanation?

MonaLiu421 commented 2 years ago

https://yulab-smu.top/MicrobiotaProcessWorkshop/articles/MicrobiotaProcessWorkshop.html,like this.

MonaLiu421 commented 2 years ago

hello, bother you again. I would like to know how to visualize a certain classification level like genus using ggdiffbox. I can't find how to filter. your code like mouse.time.mpse %>% mp_rrarefy() %>% mp_diff_analysis(.abundance=RareAbundance, .group=time) %>% mp_extract_taxatree %>% dplyr::filter(!is.na(LDAmean) & nodeClass=="OTU", keep.td=F) is ok, but it can't plot using ggdiffbox. or plot it by ggplot2 manually?

xiangpin commented 2 years ago

Method 1

> library(MicrobiotaProcess)
MicrobiotaProcess v1.7.8 For help:
https://github.com/YuLab-SMU/MicrobiotaProcess/issues

If you use MicrobiotaProcess in published research, please cite the
paper:

S Xu, L Zhan, W Tang, Z Dai, L Zhou, T Feng, M Chen, S Liu, X Fu, T Wu,
E Hu, G Yu. MicrobiotaProcess: A comprehensive R package for managing
and analyzing microbiome and other ecological data within the tidy
framework. 04 February 2022, PREPRINT (Version 1) available at Research
Square [https://doi.org/10.21203/rs.3.rs-1284357/v1]

This message can be suppressed by:
suppressPackageStartupMessages(library(MicrobiotaProcess))
> data(mouse.time.mpse)
> mouse.time.mpse %>% mp_rrarefy() %>% mp_diff_analysis(.abundance=RareAbundance, .group=time, action='get') %>% dplyr::filter(grepl("^g__", f)) %>% ggdiffbox()

xx2

if the action of mp_diff_analysis is get, the result will return a diffAnalysisClass, then you can use filter to choose what you want.

Method 2

mouse.time.mpse %>% mp_rrarefy() %>% mp_diff_analysis(.abundance=RareAbundance, .group=time) %>% mp_extract_taxatree() %>% dplyr::filter(nodeClass=="Genus" & !is.na(Sign_time), keep.td=FALSE) -> genus.tb

if the action of mp_diff_analysis is add, the differential result will be added to the taxatree slot of MPSE, you can extract the result by mp_extract_taxatree, then you can use filter to choose what you want.

genus.tb %>% tidyr::unnest(RareAbundanceBySample) %>% dplyr::filter(!grepl("^g__un_", label)) %>% ggplot(aes(y=label, x =RelRareAbundanceBySample, fill=time)) + geom_boxplot(orientation='y') -> p1
genus.tb %>% dplyr::filter(!grepl("^g__un_", label)) %>% ggplot(aes(y=label, xmin=LDAlower, xmax=LDAupper)) + geom_errorbar(width=.2) + geom_point(aes(x=LDAmean, y=label, fill=Sign_time), shape=21, size=2) + theme(axis.text.y=element_blank()) + ylab(NULL) -> p2
p3 <- p2 %>% aplot::insert_left(p1)
p3

xx

MonaLiu421 commented 2 years ago

wow, thanks a million. you are so nice.