ChiLiubio / microeco

An R package for data analysis in microbial community ecology

Is it possible to use microeco to screen for 'rare/abundant taxa'? #402

Closed makerer5 closed 1 week ago

makerer5 commented 3 weeks ago

Hi Chi, I used microeco to study the diversity of microbial communities, and now we want to explore the rare and abundant taxa in the study area. Can I use microeco to select them? I noticed that microeco has a filter_taxa function, but its options do not match the ranges I want to set. How can I use filter_taxa to correctly select the following abundance categories?

(1) Always abundant taxa (AAT): OTUs with abundance ≥ 1% in all samples.
(2) Always rare taxa (ART): OTUs with abundance < 0.01% in all samples.
(3) Moderate taxa (MT): OTUs with abundance between 0.01% and 1% in all samples.
(4) Conditionally rare taxa (CRT): OTUs with abundance below 1% in all samples and < 0.01% in some samples.
(5) Conditionally abundant taxa (CAT): OTUs with abundance ≥ 0.01% in all samples and ≥ 1% in some samples, i.e. never rare (< 0.01%).
(6) Conditionally rare and abundant taxa (CRAT): OTUs with abundance varying from rare (< 0.01%) to abundant (≥ 1%).

ChiLiubio commented 3 weeks ago

Hi. Yes. Here is an example. First prepare the data.

library(microeco)
library(magrittr)
# prepare dataset
d1 <- microtable$new(otu_table_16S)
t1 <- trans_norm$new(d1)
# TSS means the relative abundance
d1 <- t1$norm(method = "TSS")

It is simple to do this with the apply function.

# thresholds on relative abundance: 1% = 0.01, 0.01% = 0.0001
# 1 AAT: >= 1% in all samples
sel_logical <- apply(d1$otu_table, 1, function(x){all(x >= 0.01)})
AAT <- rownames(d1$otu_table)[sel_logical]
# 2 ART: < 0.01% in all samples
sel_logical <- apply(d1$otu_table, 1, function(x){all(x < 0.0001)})
ART <- rownames(d1$otu_table)[sel_logical]
# 3 MT: >= 0.01% and < 1% in all samples
sel_logical <- apply(d1$otu_table, 1, function(x){all(x >= 0.0001 & x < 0.01)})
MT <- rownames(d1$otu_table)[sel_logical]
# 4 CRT: < 1% in all samples and < 0.01% in some; as defined, this also matches ART
sel_logical <- apply(d1$otu_table, 1, function(x){all(x < 0.01) & any(x < 0.0001)})
CRT <- rownames(d1$otu_table)[sel_logical]
# 5 CAT: >= 0.01% in all samples and >= 1% in some; as defined, this also matches AAT
sel_logical <- apply(d1$otu_table, 1, function(x){all(x >= 0.0001) & any(x >= 0.01)})
CAT <- rownames(d1$otu_table)[sel_logical]
# 6 CRAT: >= 1% in some samples and < 0.01% in others
sel_logical <- apply(d1$otu_table, 1, function(x){any(x >= 0.01) & any(x < 0.0001)})
CRAT <- rownames(d1$otu_table)[sel_logical]

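The six definitions overlap when read literally (every ART taxon also satisfies CRT, and every AAT taxon also satisfies CAT). If one label per OTU is wanted, the categories can be made mutually exclusive by looking only at each OTU's minimum and maximum relative abundance. The following is a minimal base R sketch under that assumption; the helper `classify_taxa`, its argument names, and the toy matrix `rel` are hypothetical and not part of microeco.

```r
# Toy relative-abundance matrix (rows = OTUs, columns = samples); values are made up.
rel <- rbind(
  otu_AAT  = c(0.02,    0.05,  0.03),
  otu_ART  = c(0.00001, 0,     0.00005),
  otu_MT   = c(0.001,   0.005, 0.0002),
  otu_CRT  = c(0.00005, 0.005, 0.001),
  otu_CAT  = c(0.001,   0.02,  0.005),
  otu_CRAT = c(0.00001, 0.03,  0.002)
)
colnames(rel) <- c("S1", "S2", "S3")

# Hypothetical helper: assign exactly one category per OTU.
# Assumption: categories are mutually exclusive, unlike the literal definitions.
classify_taxa <- function(x, rare = 1e-4, abundant = 1e-2) {
  lo <- apply(x, 1, min)  # lowest relative abundance across samples
  hi <- apply(x, 1, max)  # highest relative abundance across samples
  out <- rep(NA_character_, nrow(x))
  out[lo >= abundant] <- "AAT"                              # never below 1%
  out[hi < rare] <- "ART"                                   # never reaches 0.01%
  out[lo >= rare & hi < abundant] <- "MT"                   # always in [0.01%, 1%)
  out[lo < rare & hi >= rare & hi < abundant] <- "CRT"      # sometimes rare, never abundant
  out[lo >= rare & lo < abundant & hi >= abundant] <- "CAT" # sometimes abundant, never rare
  out[lo < rare & hi >= abundant] <- "CRAT"                 # both rare and abundant
  setNames(out, rownames(x))
}

taxa_class <- classify_taxa(rel)
table(taxa_class)
```

With a normalized microtable, `classify_taxa(d1$otu_table)` should work the same way, since apply converts the data.frame to a matrix.
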
The same can be done with the filter_taxa function of microeco, though it is more complex.

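# 1 AAT
tmp <- clone(d1)
# zero out values below 1%, then keep only taxa that are non-zero in every sample
tmp$otu_table[d1$otu_table < 0.01] <- 0
tmp$filter_taxa(freq = ncol(d1$otu_table))
# the output shows that no features remain
# 2 ART; this assumes that zero abundance also counts as rare, i.e. occurrence frequency is ignored
tmp <- clone(d1)
tmp$otu_table[d1$otu_table < 0.0001] <- 0
tmp$filter_taxa(freq = 1)
# ART = taxa that never reach 0.01%, i.e. those removed by the filter
ART2 <- rownames(d1$otu_table) %>% .[!. %in% rownames(tmp$otu_table)]
# 3 MT
tmp <- clone(d1)
# zero out values outside [0.01%, 1%), then keep taxa that are non-zero in every sample
tmp$otu_table[d1$otu_table >= 0.01] <- 0
tmp$otu_table[d1$otu_table < 0.0001] <- 0
tmp$filter_taxa(freq = ncol(d1$otu_table))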
# 4 CRT
# CRT_rmhigh keeps taxa that reach 1% in at least one sample
CRT_rmhigh <- clone(d1)
CRT_rmhigh$otu_table[d1$otu_table < 0.01] <- 0
CRT_rmhigh$filter_taxa(freq = 1)
# CRT_rmlow keeps taxa that are >= 0.01% in every sample
CRT_rmlow <- clone(d1)
CRT_rmlow$otu_table[d1$otu_table < 0.0001] <- 0
CRT_rmlow$filter_taxa(freq = ncol(d1$otu_table))
# CRT = taxa in neither set
CRT2 <- rownames(d1$otu_table) %>% .[!. %in% rownames(CRT_rmhigh$otu_table)] %>% .[!. %in% rownames(CRT_rmlow$otu_table)]
# 5 CAT
CAT_low <- clone(d1)
CAT_low$otu_table[d1$otu_table < 0.0001] <- 0
CAT_low$filter_taxa(freq = ncol(d1$otu_table))
CAT_high <- clone(d1)
CAT_high$otu_table[d1$otu_table < 0.01] <- 0
CAT_high$filter_taxa(freq = 1)
CAT2 <- intersect(rownames(CAT_low$otu_table), rownames(CAT_high$otu_table))
# 6 CRAT
CRAT_rmlow <- clone(d1)
CRAT_rmlow$otu_table[d1$otu_table < 0.0001] <- 0
CRAT_rmlow$filter_taxa(freq = ncol(d1$otu_table))
# taxa that are rare (< 0.01%) in at least one sample
CRAT_low <- rownames(d1$otu_table) %>% .[!. %in% rownames(CRAT_rmlow$otu_table)]
CRAT_high <- clone(d1)
CRAT_high$otu_table[d1$otu_table < 0.01] <- 0
CRAT_high$filter_taxa(freq = 1)
CRAT2 <- intersect(CRAT_low, rownames(CRAT_high$otu_table))

makerer5 commented 3 weeks ago

Hi Chi, I successfully ran the code you gave me, and the results are stored as character vectors of taxa names. Does the microeco package have a function for the next step? Can I filter the entire microeco R6 object for rare/abundant taxa? In other words, can I filter the dataset composed of otu_table, tax_table, sam_table, and tree?

ChiLiubio commented 3 weeks ago

Sure. Subset the otu_table to the selected taxa, then tidy the object so that the other tables are trimmed to match.

library(microeco)
library(magrittr)
# prepare dataset with more info for the demonstration
d1 <- microtable$new(otu_table_16S, tax_table = taxonomy_table_16S)
t1 <- trans_norm$new(d1)
# TSS means the relative abundance
d1 <- t1$norm(method = "TSS")

# 2 ART
sel_logical <- apply(d1$otu_table, 1, function(x){all(x < 0.0001)})
ART <- rownames(d1$otu_table)[sel_logical]

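# the solution
ART_mt <- clone(d1)
# keep only the ART rows; drop = FALSE keeps the result a table in edge cases
ART_mt$otu_table %<>% .[ART, , drop = FALSE]
# before tidying: tax_table still contains all taxa
ART_mt
# trim tax_table and the other tables to the taxa left in otu_table
ART_mt$tidy_dataset()
ART_mt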
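
To make the tidy step less of a black box, here is a small base R sketch of the row-matching idea behind it: after the abundance table is subset, the taxonomy table is cut down to the taxa the two tables still share. This only illustrates the concept and is not microeco's implementation; the tables `otu` and `tax` and the vector `keep` are made-up examples.

```r
# Made-up abundance and taxonomy tables sharing the same row names
otu <- data.frame(S1 = c(0.5, 0.3, 0.2), S2 = c(0.6, 0.1, 0.3),
                  row.names = c("OTU_1", "OTU_2", "OTU_3"))
tax <- data.frame(Phylum = c("p__A", "p__B", "p__C"),
                  row.names = c("OTU_1", "OTU_2", "OTU_3"))

# Taxa selected by some abundance rule (hypothetical selection)
keep <- c("OTU_1", "OTU_3")

# Step 1: subset the abundance table only
otu_sub <- otu[keep, , drop = FALSE]

# Step 2: "tidy" by trimming the taxonomy table to the shared taxa
shared <- intersect(rownames(otu_sub), rownames(tax))
tax_sub <- tax[shared, , drop = FALSE]

nrow(tax)      # taxonomy rows before trimming
nrow(tax_sub)  # taxonomy rows after trimming
```

In microeco itself, calling tidy_dataset() on the object handles this for all of its tables, so the manual matching above is not needed in practice.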