YuLab-SMU / MicrobiotaProcess

:microbe: A comprehensive R package for deep mining microbiome
https://www.sciencedirect.com/science/article/pii/S2666675823000164
178 stars 37 forks source link

How to convert phloseq object to MPSE object properly with funtion as.MPSE #30

Open taekwonleo-yuhao opened 3 years ago

taekwonleo-yuhao commented 3 years ago

Hi dear developer,

I read your "ggtreeExtra的开发及其在宏基因组上的应用" from YuLabSMU wechat channel. It is a very nice and helpful tutorial. But I am confused about the MPSE object which you use for downstream analysis. I can not find much useful information about MPSE from google.

I also tried to convert phloseq object to MPSE object with MicrobiotaProcess funtion 'as.MPSE'. But I believed I did not do it properly. Because I can not found Taxonomy Information after convert. On the other hand, I wonder if we can use the phylogenetic tree information from the phloseq object to draw the tree directly?

Thanks, Yuhao

xiangpin commented 3 years ago

The MPSE object was introduced in the newest version, it has not been released. And I will update the vignettes as soon as possible. But you can refer to the example and the manual of the corresponding functions (start with mp_).

Please provide a reproducible example. This is an example of converting phyloseq to MPSE object.

> library(MicrobiotaProcess)

Attaching package: ‘MicrobiotaProcess’

The following object is masked from ‘package:stats’:

    filter

> data(test_otu_data)
> test_otu_data
phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 99 taxa and 12 samples ]
sample_data() Sample Data:       [ 12 samples by 2 sample variables ]
tax_table()   Taxonomy Table:    [ 99 taxa by 7 taxonomic ranks ]
> mpse <- test_otu_data %>% as.MPSE
> mpse
# A MPSE-tibble (MPSE object) abstraction: 1,188 × 12
# OTU=99 | Samples=12 | Assays=Abundance | Taxanomy=Kingdom, Phylum, Class, Order, Family, Genus, Species
   OTU    Sample Abundance sample group Kingdom  Phylum Class Order Family Genus
   <chr>  <chr>      <int> <fct>  <fct> <chr>    <chr>  <chr> <chr> <chr>  <chr>
 1 OTU_1  B11         3995 B11    B     k__Bact… p__Fi… c__B… o__L… f__La… g__L…
 2 OTU_10 B11          605 B11    B     k__Bact… p__Fi… c__B… o__L… f__La… g__L…
 3 OTU_1… B11           57 B11    B     k__Bact… p__Fi… c__C… o__C… f__Ru… g__R…
 4 OTU_1… B11            0 B11    B     k__Bact… p__Fi… c__C… o__C… f__Ru… g__I…
 5 OTU_1… B11            5 B11    B     k__Bact… p__Fi… c__E… o__E… f__Er… g__E…
 6 OTU_1… B11            1 B11    B     k__Bact… p__Fi… c__C… o__C… f__Ru… g__R…
 7 OTU_1… B11            0 B11    B     k__Bact… p__Ba… c__B… o__B… f__Pr… g__P…
 8 OTU_1… B11            0 B11    B     k__Bact… p__Fi… c__C… o__C… f__Ru… g__C…
 9 OTU_1… B11            0 B11    B     k__Bact… p__Ba… c__B… o__S… f__Sp… g__S…
10 OTU_1… B11           34 B11    B     k__Bact… p__Fi… c__C… o__C… f__Ru… g__R…
# … with 1,178 more rows, and 1 more variable: Species <chr>
>

Yes, The newest version has supported it, you can reinstall it by remotes::install_github("YuLab-SMU/MicrobiotaProcess") and perform mp_diff_analysis with parameter action="add", then use mp_extract_tree(type="otutree") to get the phylogenetic tree, mp_extract_tree(type="taxatree") to get the taxonomy tree. You can refer to the following example.

> library(MicrobiotaProcess)

Attaching package: ‘MicrobiotaProcess’

The following object is masked from ‘package:stats’:

    filter

> mpse <- mp_import_qiime2(otuqza="./table.qza", taxaqza="./taxonomy.qza", mapfilename="./sample-metadata.tsv", refseqqza="./refseq.qza", treeqza="./tree.qza")
Warning messages:
1: The treeqza is a tree file, it is parsing by read.tree function.
* It is better to parse it with the function of treeio
* then the treedata or phylo result all can be supported.
2: The number of samples in otu table is not equal the number of samples in sample data.
* The same samples will be extract automatically !
> mpse
# A MPSE-tibble (MPSE object) abstraction: 26,180 × 19
# OTU=770 | Samples=34 | Assays=Abundance | Taxanomy=Kingdom, Phylum, Class, Order, Family, Genus, Speies
   OTU     Sample Abundance barcode.sequence body.site year  month day   subject
   <chr>   <chr>      <dbl> <chr>            <chr>     <chr> <chr> <chr> <chr>
 1 4b5eeb… L1S8        2599 AGCTGACTAGTC     gut       2008  10    28    subjec…
 2 fe30ff… L1S8           0 AGCTGACTAGTC     gut       2008  10    28    subjec…
 3 d29fe3… L1S8           0 AGCTGACTAGTC     gut       2008  10    28    subjec…
 4 868528… L1S8           0 AGCTGACTAGTC     gut       2008  10    28    subjec…
 5 154709… L1S8        1623 AGCTGACTAGTC     gut       2008  10    28    subjec…
 6 1d2e5f… L1S8           5 AGCTGACTAGTC     gut       2008  10    28    subjec…
 7 0305a4… L1S8         243 AGCTGACTAGTC     gut       2008  10    28    subjec…
 8 cb2fe0… L1S8           0 AGCTGACTAGTC     gut       2008  10    28    subjec…
 9 997056… L1S8           3 AGCTGACTAGTC     gut       2008  10    28    subjec…
10 3c9c43… L1S8           0 AGCTGACTAGTC     gut       2008  10    28    subjec…
# … with 26,170 more rows, and 10 more variables: reported.antibiotic.usage <chr>,
#   days.since.experiment.start <chr>, Confidence <dbl>, Kingdom <chr>,
#   Phylum <chr>, Class <chr>, Order <chr>, Family <chr>, Genus <chr>,
#   Speies <chr>
> mpse %<>% mp_rrarefy()
> mpse
# A MPSE-tibble (MPSE object) abstraction: 26,180 × 20
# OTU=770 | Samples=34 | Assays=Abundance, RareAbundance | Taxanomy=Kingdom, Phylum, Class, Order, Family, Genus, Speies
   OTU     Sample Abundance RareAbundance barcode.sequence body.site year  month
   <chr>   <chr>      <dbl>         <dbl> <chr>            <chr>     <chr> <chr>
 1 4b5eeb… L1S8        2599           348 AGCTGACTAGTC     gut       2008  10
 2 fe30ff… L1S8           0             0 AGCTGACTAGTC     gut       2008  10
 3 d29fe3… L1S8           0             0 AGCTGACTAGTC     gut       2008  10
 4 868528… L1S8           0             0 AGCTGACTAGTC     gut       2008  10
 5 154709… L1S8        1623           202 AGCTGACTAGTC     gut       2008  10
 6 1d2e5f… L1S8           5             1 AGCTGACTAGTC     gut       2008  10
 7 0305a4… L1S8         243            38 AGCTGACTAGTC     gut       2008  10
 8 cb2fe0… L1S8           0             0 AGCTGACTAGTC     gut       2008  10
 9 997056… L1S8           3             0 AGCTGACTAGTC     gut       2008  10
10 3c9c43… L1S8           0             0 AGCTGACTAGTC     gut       2008  10
# … with 26,170 more rows, and 12 more variables: day <chr>, subject <chr>,
#   reported.antibiotic.usage <chr>, days.since.experiment.start <chr>,
#   Confidence <dbl>, Kingdom <chr>, Phylum <chr>, Class <chr>, Order <chr>,
#   Family <chr>, Genus <chr>, Speies <chr>
> mpse %<>% mp_diff_analysis(.abundance=RareAbundance, .group=body.site, first.test.alpha=0.01)
> mpse
# A MPSE-tibble (MPSE object) abstraction: 26,180 × 27
# OTU=770 | Samples=34 | Assays=Abundance, RareAbundance, RelRareAbundanceBySample | Taxanomy=Kingdom, Phylum, Class, Order, Family, Genus, Speies
   OTU         Sample Abundance RareAbundance RelRareAbundance… barcode.sequence
   <chr>       <chr>      <dbl>         <dbl>             <dbl> <chr>
 1 4b5eeb3003… L1S8        2599           348            38.8   AGCTGACTAGTC
 2 fe30ff0f71… L1S8           0             0             0     AGCTGACTAGTC
 3 d29fe3c705… L1S8           0             0             0     AGCTGACTAGTC
 4 868528ca94… L1S8           0             0             0     AGCTGACTAGTC
 5 154709e160… L1S8        1623           202            22.5   AGCTGACTAGTC
 6 1d2e5f3444… L1S8           5             1             0.111 AGCTGACTAGTC
 7 0305a4993e… L1S8         243            38             4.23  AGCTGACTAGTC
 8 cb2fe0146e… L1S8           0             0             0     AGCTGACTAGTC
 9 997056ba80… L1S8           3             0             0     AGCTGACTAGTC
10 3c9c437f27… L1S8           0             0             0     AGCTGACTAGTC
# … with 26,170 more rows, and 21 more variables: body.site <chr>, year <chr>, month <chr>,
#   day <chr>, subject <chr>, reported.antibiotic.usage <chr>,
#   days.since.experiment.start <chr>, Confidence <dbl>, Kingdom <chr>,
#   Phylum <chr>, Class <chr>, Order <chr>, Family <chr>, Genus <chr>,
#   Speies <chr>, LDAupper <dbl>, LDAmean <dbl>, LDAlower <dbl>,
#   Sign_body.site <chr>, pvalue <dbl>, fdr <dbl>
> otutree <- mpse %>% mp_extract_tree(type="otutree")
> otutree
'treedata' S4 object'.

...@ phylo:

Phylogenetic tree with 770 tips and 768 internal nodes.

Tip labels:
  35bfc371d940cffdc527b7b4dc954456, a95851baf426c85eae4419617db902a7, 08bd73939f7bfd85daf2eaee7e9c9bf8, a9e0ac523112f40942da575caf1f386e, 033511c7ff4fe93866075cfb9129aa3b, fc5b641a0b0408d99ddfb2a5ba64da59, ...
Node labels:
  root, 1.000, 0.621, , 0.894, 0.627, ...

Rooted; includes branch lengths.

with the following features available:
        '',     'RareAbundanceBySample',        'LDAupper',     'LDAmean',      'LDAlower',
        'Sign_body.site',       'pvalue',       'fdr'.

# The associated data tibble abstraction: 1,538 × 10
# The 'node', 'label' and 'isTip' are from the phylo tree.
    node label   isTip RareAbundanceBy… LDAupper LDAmean LDAlower Sign_body.site
   <int> <chr>   <lgl> <list>              <dbl>   <dbl>    <dbl> <chr>
 1     1 35bfc3… TRUE  <tibble [34 × 1…       NA      NA       NA NA
 2     2 a95851… TRUE  <tibble [34 × 1…       NA      NA       NA NA
 3     3 08bd73… TRUE  <tibble [34 × 1…       NA      NA       NA NA
 4     4 a9e0ac… TRUE  <tibble [34 × 1…       NA      NA       NA NA
 5     5 033511… TRUE  <tibble [34 × 1…       NA      NA       NA NA
 6     6 fc5b64… TRUE  <tibble [34 × 1…       NA      NA       NA NA
 7     7 4980b8… TRUE  <tibble [34 × 1…       NA      NA       NA NA
 8     8 e04721… TRUE  <tibble [34 × 1…       NA      NA       NA NA
 9     9 3b74a9… TRUE  <tibble [34 × 1…       NA      NA       NA NA
10    10 96a7db… TRUE  <tibble [34 × 1…       NA      NA       NA NA
# … with 1,528 more rows, and 2 more variables: pvalue <dbl>, fdr <dbl>
> library(ggtree)
ggtree v3.1.3.992  For help: https://yulab-smu.top/treedata-book/

If you use ggtree in published research, please cite the most appropriate paper(s):

1. Guangchuang Yu. Using ggtree to visualize data on tree-like structures. Current Protocols in Bioinformatics, 2020, 69:e96. doi:10.1002/cpbi.96
2. Guangchuang Yu, Tommy Tsan-Yuk Lam, Huachen Zhu, Yi Guan. Two methods for mapping and visualizing associated data on phylogeny using ggtree. Molecular Biology and Evolution 2018, 35(12):3041-3043. doi:10.1093/molbev/msy194
3. Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods in Ecology and Evolution 2017, 8(1):28-36. doi:10.1111/2041-210X.12628

> otutree %>% ggtree(layout="circular")
padingdun833 commented 3 years ago

Hi dear developer,

issue I have the question like above,I can't build mpse object using those files in the pictures!

xiangpin commented 3 years ago

Which version are you using, I think you can reinstall it by remotes::install_github("YuLab-SMU/MicrobiotaProcess")

xiangpin commented 3 years ago

ps, I also think you should make sure the work path contained the files (table.qza, taxonomy.qza, etc).

padingdun833 commented 3 years ago

Which version are you using, I think you can reinstall it by remotes::install_github("YuLab-SMU/MicrobiotaProcess")

I just upgrade the "MicrobiotaProcess",and it installed sucessfully,but the issue is still exsiting.

padingdun833 commented 3 years ago

ps, I also think you should make sure the work path contained the files (table.qza, taxonomy.qza, etc).

Isn't that example data in the package? There is no data in the package file I opened for installation. Or are these external data?if so,I will see!

cmetadea commented 10 months ago

Hi, I tried to convert phyloseq object (after creating it) but is not successful. here is the script and output :

> data(ps_norm)
Warning message:
In data(ps_norm) : data set ‘ps_norm’ not found

What happened here? Thanks