YuLab-SMU / MicrobiotaProcess

:microbe: A comprehensive R package for deep mining microbiome
https://www.sciencedirect.com/science/article/pii/S2666675823000164

include.lowest = FALSE in mp_filter_taxa does not seem to work #86

Open Hua-CM opened 1 year ago

Hua-CM commented 1 year ago

When I use the mp_filter_taxa function, include.lowest does not seem to work: the function always returns the include.lowest=FALSE result. However, when I try it with the example dataset mouse.time.mpse, it works fine. I could not figure out why, but I think this is very important because many users may not notice it. My dataset is not public, so if you need it for testing, you are welcome to contact me.

> mp_filter_taxa(mp_raw, .abundance = Abundance, min.abun = 1, min.prop = 0.1, iclude.lowest=FALSE)
# A MPSE-tibble (MPSE object) abstraction: 825,086 × 15
# OTU=6763 | Samples=122 | Assays=Abundance | Taxonomy=Kingdom, Phylum, Class, Order, Family, Genus, Species
   OTU    Sample Abundance origin suborigin bioreptype biotype oritype Kingdom     Phylum      Class Order Family Genus Species
   <chr>  <chr>      <int> <chr>  <chr>     <chr>      <chr>   <chr>   <chr>       <chr>       <chr> <chr> <chr>  <chr> <chr>  
 1 OTU_1  JLBX1E         0 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__E… f__Ye… g__S… s__Ser…
 2 OTU_2  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Bu… g__B… s__Par…
 3 OTU_3  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__X… f__Rh… g__u… s__un_…
 4 OTU_4  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__A… o__S… f__Sp… g__S… s__un_…
 5 OTU_5  JLBX1E        11 JL     BX        1          E       JLE     k__Bacteria p__Bactero… c__B… o__F… f__We… g__C… s__Chr…
 6 OTU_6  JLBX1E         0 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__A… o__R… f__Rh… g__P… s__Phy…
 7 OTU_7  JLBX1E         4 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Bu… g__B… s__un_…
 8 OTU_8  JLBX1E        27 JL     BX        1          E       JLE     k__Bacteria p__Actinob… c__A… o__M… f__Mi… g__M… s__un_…
 9 OTU_9  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__X… f__Rh… g__u… s__un_…
10 OTU_10 JLBX1E         2 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Ox… g__C… s__Col…
# ℹ 825,076 more rows
# ℹ Use `print(n = ...)` to see more rows
> 
> mp_filter_taxa(mp_raw, .abundance = Abundance, min.abun = 1, min.prop = 0.1, iclude.lowest=TRUE)
# A MPSE-tibble (MPSE object) abstraction: 825,086 × 15
# OTU=6763 | Samples=122 | Assays=Abundance | Taxonomy=Kingdom, Phylum, Class, Order, Family, Genus, Species
   OTU    Sample Abundance origin suborigin bioreptype biotype oritype Kingdom     Phylum      Class Order Family Genus Species
   <chr>  <chr>      <int> <chr>  <chr>     <chr>      <chr>   <chr>   <chr>       <chr>       <chr> <chr> <chr>  <chr> <chr>  
 1 OTU_1  JLBX1E         0 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__E… f__Ye… g__S… s__Ser…
 2 OTU_2  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Bu… g__B… s__Par…
 3 OTU_3  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__X… f__Rh… g__u… s__un_…
 4 OTU_4  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__A… o__S… f__Sp… g__S… s__un_…
 5 OTU_5  JLBX1E        11 JL     BX        1          E       JLE     k__Bacteria p__Bactero… c__B… o__F… f__We… g__C… s__Chr…
 6 OTU_6  JLBX1E         0 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__A… o__R… f__Rh… g__P… s__Phy…
 7 OTU_7  JLBX1E         4 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Bu… g__B… s__un_…
 8 OTU_8  JLBX1E        27 JL     BX        1          E       JLE     k__Bacteria p__Actinob… c__A… o__M… f__Mi… g__M… s__un_…
 9 OTU_9  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__X… f__Rh… g__u… s__un_…
10 OTU_10 JLBX1E         2 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Ox… g__C… s__Col…
# ℹ 825,076 more rows
# ℹ Use `print(n = ...)` to see more rows
> mp_filter_taxa(mp_raw, .abundance = Abundance, min.abun = 2, min.prop = 0.1, iclude.lowest=TRUE)
# A MPSE-tibble (MPSE object) abstraction: 545,462 × 15
# OTU=4471 | Samples=122 | Assays=Abundance | Taxonomy=Kingdom, Phylum, Class, Order, Family, Genus, Species
   OTU    Sample Abundance origin suborigin bioreptype biotype oritype Kingdom     Phylum      Class Order Family Genus Species
   <chr>  <chr>      <int> <chr>  <chr>     <chr>      <chr>   <chr>   <chr>       <chr>       <chr> <chr> <chr>  <chr> <chr>  
 1 OTU_1  JLBX1E         0 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__E… f__Ye… g__S… s__Ser…
 2 OTU_2  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Bu… g__B… s__Par…
 3 OTU_3  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__X… f__Rh… g__u… s__un_…
 4 OTU_4  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__A… o__S… f__Sp… g__S… s__un_…
 5 OTU_5  JLBX1E        11 JL     BX        1          E       JLE     k__Bacteria p__Bactero… c__B… o__F… f__We… g__C… s__Chr…
 6 OTU_6  JLBX1E         0 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__A… o__R… f__Rh… g__P… s__Phy…
 7 OTU_7  JLBX1E         4 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Bu… g__B… s__un_…
 8 OTU_8  JLBX1E        27 JL     BX        1          E       JLE     k__Bacteria p__Actinob… c__A… o__M… f__Mi… g__M… s__un_…
 9 OTU_9  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__X… f__Rh… g__u… s__un_…
10 OTU_10 JLBX1E         2 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Ox… g__C… s__Col…
# ℹ 545,452 more rows
# ℹ Use `print(n = ...)` to see more rows
> mp_filter_taxa(mp_raw, .abundance = Abundance, min.abun = 2, min.prop = 0.1, iclude.lowest=FALSE)
# A MPSE-tibble (MPSE object) abstraction: 545,462 × 15
# OTU=4471 | Samples=122 | Assays=Abundance | Taxonomy=Kingdom, Phylum, Class, Order, Family, Genus, Species
   OTU    Sample Abundance origin suborigin bioreptype biotype oritype Kingdom     Phylum      Class Order Family Genus Species
   <chr>  <chr>      <int> <chr>  <chr>     <chr>      <chr>   <chr>   <chr>       <chr>       <chr> <chr> <chr>  <chr> <chr>  
 1 OTU_1  JLBX1E         0 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__E… f__Ye… g__S… s__Ser…
 2 OTU_2  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Bu… g__B… s__Par…
 3 OTU_3  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__X… f__Rh… g__u… s__un_…
 4 OTU_4  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__A… o__S… f__Sp… g__S… s__un_…
 5 OTU_5  JLBX1E        11 JL     BX        1          E       JLE     k__Bacteria p__Bactero… c__B… o__F… f__We… g__C… s__Chr…
 6 OTU_6  JLBX1E         0 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__A… o__R… f__Rh… g__P… s__Phy…
 7 OTU_7  JLBX1E         4 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Bu… g__B… s__un_…
 8 OTU_8  JLBX1E        27 JL     BX        1          E       JLE     k__Bacteria p__Actinob… c__A… o__M… f__Mi… g__M… s__un_…
 9 OTU_9  JLBX1E         1 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__X… f__Rh… g__u… s__un_…
10 OTU_10 JLBX1E         2 JL     BX        1          E       JLE     k__Bacteria p__Proteob… c__G… o__B… f__Ox… g__C… s__Col…
# ℹ 545,452 more rows
# ℹ Use `print(n = ...)` to see more rows
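
One thing worth checking in the calls above: the argument is spelled `iclude.lowest`, not `include.lowest`. In R, an argument name that matches no formal parameter is absorbed by `...` when the function has one, so the intended parameter keeps its default and no error is raised. Whether this explains the behaviour reported here is not confirmed in this thread. The sketch below uses a hypothetical stand-in, `toy_filter`, which is not the MicrobiotaProcess source, to show the effect.

```r
# Hypothetical stand-in with a `...` catch-all; NOT the mp_filter_taxa implementation.
toy_filter <- function(x, min.abun = 0, include.lowest = FALSE, ...) {
  # include.lowest = TRUE keeps values equal to the threshold as well.
  if (include.lowest) x[x >= min.abun] else x[x > min.abun]
}

# Made-up counts for illustration.
counts <- c(0, 1, 1, 2, 4, 11, 27)

# Misspelled name: swallowed by `...`, so include.lowest stays FALSE.
misspelled <- toy_filter(counts, min.abun = 1, iclude.lowest = TRUE)

# Correct name: the boundary value 1 is now kept.
correct <- toy_filter(counts, min.abun = 1, include.lowest = TRUE)

length(misspelled)  # 4 values pass `> 1`
length(correct)     # 6 values pass `>= 1`
```

Rerunning the original calls with the argument spelled `include.lowest` would show whether the two settings then give different OTU counts.
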
xiangpin commented 1 year ago

You can try a larger min.abun value with include.lowest=T (meaning >= min.abun) or include.lowest=F (meaning > min.abun). You can also extract the specified assay and tabulate it to check the counts.

> mouse.time.mpse %>% mp_extract_assays(.abundance=Abundance) %>% `>`(1) %>% table()
.
FALSE  TRUE
 2521  1621
> mouse.time.mpse %>% mp_extract_assays(.abundance=Abundance) %>% `>=`(1) %>% table()
.
FALSE  TRUE
 2521  1621
> mouse.time.mpse %>% mp_extract_assays(.abundance=Abundance) %>% `>`(2) %>% table()
.
FALSE  TRUE
 2535  1607
> mouse.time.mpse %>% mp_extract_assays(.abundance=Abundance) %>% `>=`(2) %>% table()
.
FALSE  TRUE
 2521  1621
> mouse.time.mpse %>% mp_extract_assays(.abundance=Abundance) %>% `>`(3) %>% table()
.
FALSE  TRUE
 2570  1572
> mouse.time.mpse %>% mp_extract_assays(.abundance=Abundance) %>% `>=`(3) %>% table()
.
FALSE  TRUE
 2535  1607
> mouse.time.mpse %>% mp_extract_assays(.abundance=Abundance) %>% `>`(4) %>% table()
.
FALSE  TRUE
 2609  1533
> mouse.time.mpse %>% mp_extract_assays(.abundance=Abundance) %>% `>=`(4) %>% table()
.
FALSE  TRUE
 2570  1572
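
The `>` versus `>=` comparison above can also be reproduced on a small matrix without the package. This is a minimal sketch, assuming that a taxon is kept when it passes the abundance threshold in at least `min.prop` of the samples. That rule, the helper name `keep_taxa`, and the toy values are assumptions for illustration, not the mp_filter_taxa source.

```r
# Toy abundance matrix: rows are taxa, columns are samples (made-up values).
abun <- matrix(
  c(0, 1, 1, 0,
    2, 2, 0, 0,
    3, 5, 2, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("OTU_a", "OTU_b", "OTU_c"), paste0("S", 1:4))
)

# Hypothetical helper: TRUE for each taxon that passes the threshold
# in at least `min.prop` of the samples.
keep_taxa <- function(mat, min.abun, min.prop, include.lowest = FALSE) {
  passed <- if (include.lowest) mat >= min.abun else mat > min.abun
  rowMeans(passed) >= min.prop
}

strict    <- keep_taxa(abun, min.abun = 2, min.prop = 0.5, include.lowest = FALSE)
inclusive <- keep_taxa(abun, min.abun = 2, min.prop = 0.5, include.lowest = TRUE)

strict     # only OTU_c is kept
inclusive  # OTU_b and OTU_c are kept
```

OTU_b has the value 2 in exactly half of the samples, so it is kept only when the boundary is included. If no taxon in a dataset sits on such a boundary, both settings return the same result.
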