ChiLiubio / microeco

An R package for data analysis in microbial community ecology
GNU General Public License v3.0

relative abundance for dominant functions for 16S using FAPROTAX #320

Open biazen123 opened 5 months ago

biazen123 commented 5 months ago

Hi Chi,

First, thank you very much for the microeco package. I am trying to plot the relative abundance of the dominant functions in the package's example dataset using the following script.

library(microeco)
fun_16S <- clone(dataset)
fun_16S <- trans_func$new(fun_16S)

mapping the taxonomy to the FAPROTAX database

fun_16S$cal_spe_func(prok_database = "FAPROTAX")
The functional binary table is stored in object$res_spe_func ...
fun_16S$cal_spe_func()
The functional binary table is stored in object$res_spe_func ...

fun_16S$cal_spe_func_perc(abundance_weighted = TRUE)
The result table is stored in object$res_spe_func_perc ...

First, I tried to plot all functional groups.

fun_16S$plot_spe_func_perc() +
  theme(axis.line.x = element_blank(),
        axis.text.x = element_blank(),
        plot.title = element_text(hjust = -0.099, vjust = 1, size = 10, face = "bold")) +
  ggtitle(paste("Functional groups for Soil Prokaryotes using FAPROTAX"))

[Figure: Dataset_fun_ALL]

Compare the relative abundance of dominant functions between groups

fun_16S_micro <- microtable$new(as.data.frame(t(fun_16S$res_spe_func_perc)), sample_table = dataset$sample_table)
fun_16S_micro$taxa_abund$OTU <- fun_16S_micro$otu_table
fun_16S_micro$tidy_dataset()
fun_16S_micro

microtable-class object:
sample_table have 90 rows and 4 columns
otu_table have 37 rows and 90 columns
Taxa abundance: calculated for OTU

Second, I tried to compare and plot the dominant functions between groups based on relative abundance (displayed as decimals) as follows.

## plot the functional abundance.
m2 <- trans_abund$new(fun_16S_micro, taxrank = "OTU", ntaxa = 12, use_percentage = FALSE)
The transformed abundance data is stored in object$data_abund ...
class(m2$data_abund)
fun_bar <- m2$plot_bar(others_color = "grey70", legend_text_italic = FALSE, facet = "Group", xtext_keep = FALSE, order_x = c("CW", "IW", "TW"))
fun_bar

[Figure: Dataset by Group]

Now to my issue. In the first plot, the aerobic_chemoheterotrophy group has a large share and there is no chemoheterotrophy functional group. In the second plot, however, the value of the aerobic_chemoheterotrophy group from the first plot appears to be split into two groups (aerobic_chemoheterotrophy and chemoheterotrophy). Why is this happening, and could you please suggest how to fix it? Thank you very much for your continued support. Best

ChiLiubio commented 5 months ago

Hi. If you run colnames(fun_16S$res_spe_func) or colnames(fun_16S$res_spe_func_perc), you can see that there are three similar names: "aerobic_chemoheterotrophy", "chemoheterotrophy" and "anaerobic_chemoheterotrophy". The initial FAPROTAX database matching result actually has no "anaerobic_chemoheterotrophy", because FAPROTAX only has the "chemoheterotrophy" and "aerobic_chemoheterotrophy" items. The "anaerobic_chemoheterotrophy" item is generated automatically by the function as "chemoheterotrophy" minus "aerobic_chemoheterotrophy", so the final results have three.

In the first plot, I think it is enough to show "aerobic_chemoheterotrophy" and "anaerobic_chemoheterotrophy", so the default guilds in the function do not include "chemoheterotrophy". In your customized analysis, you can manipulate them as you want; they are not in conflict. The data cleaning is easy to do at this step: fun_16S_micro$taxa_abund$OTU <- fun_16S_micro$otu_table.

Best, Chi
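
The relationship between the three chemoheterotrophy columns can be illustrated on toy data. This is a minimal sketch in base R that does not call microeco: the table `func_bin` and its values are made up, and the subtraction is an assumption based on the description in this thread, not the package's actual code. It also shows one possible way to drop the parent "chemoheterotrophy" column so that only the two non-overlapping groups remain.

```r
# Toy binary table (taxa x functions); hypothetical values, not microeco output.
func_bin <- data.frame(
  aerobic_chemoheterotrophy = c(1, 0, 0, 1),
  chemoheterotrophy         = c(1, 1, 0, 1),
  row.names = paste0("ASV", 1:4)
)

# Assumed derivation: chemoheterotrophs that are not aerobic chemoheterotrophs.
func_bin$anaerobic_chemoheterotrophy <-
  func_bin$chemoheterotrophy - func_bin$aerobic_chemoheterotrophy

# Drop the parent category so only the two non-overlapping groups are kept.
keep <- setdiff(colnames(func_bin), "chemoheterotrophy")
func_clean <- func_bin[, keep, drop = FALSE]
print(func_clean)
```

In the workflow from the question, the same kind of column filter could presumably be applied to fun_16S$res_spe_func_perc before it is transposed and passed to microtable$new, so that the custom bar plot shows the same groups as the default plot.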

biazen123 commented 4 months ago

Hi, thank you very much for the clarification. Best.

biazen123 commented 4 months ago

Dear Chi, could you suggest how to calculate the proportion of ASVs that are not classified into functional groups (FAPROTAX) relative to the total number of ASVs? Best

ChiLiubio commented 4 months ago

Hi. I think you can get what you need by running 1 - apply(fun_16S$res_spe_func, 2, sum)/nrow(fun_16S$res_spe_func).
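
To see what that expression returns, here is a minimal sketch on a toy table in base R. The table `func_bin` and its values are hypothetical stand-ins for fun_16S$res_spe_func, assuming taxa in rows and functions in columns with 0/1 entries. The first result applies the expression above and gives, for each function, the share of ASVs not assigned to it. The second result is an alternative reading of the question, not something suggested in the thread: the share of ASVs assigned to no function at all.

```r
# Toy binary table (taxa x functions); hypothetical values, not microeco output.
func_bin <- data.frame(
  chemoheterotrophy = c(1, 1, 0, 0),
  nitrification     = c(0, 1, 0, 0),
  row.names = paste0("ASV", 1:4)
)

# Per function: share of ASVs NOT assigned to that function
# (same expression as in the reply above, applied to the toy table).
unassigned_per_function <- 1 - apply(func_bin, 2, sum) / nrow(func_bin)
print(unassigned_per_function)

# Alternative reading: share of ASVs with no functional assignment at all.
unassigned_overall <- mean(rowSums(func_bin) == 0)
print(unassigned_overall)
```

Here two of the four toy ASVs match chemoheterotrophy and one matches nitrification, so the per-function values are 0.5 and 0.75, while two ASVs match nothing, giving an overall unassigned share of 0.5.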