ChiLiubio / microeco

An R package for data analysis in microbial community ecology
GNU General Public License v3.0

Reorder taxa in abundance plot: input_taxaname doesn't reorder taxa #153

Open calla404 opened 1 year ago

calla404 commented 1 year ago

Hi Chi,

I have a quick question about how to reorder taxa in the abundance bar plots. I see there is a discussion thread that addresses this issue, but microeco has been updated a couple of times since then. If I'm not mistaken, "input_taxaname = " in the trans_abund$new command is how reordering taxa should be achieved now. While I think it is successfully selecting the taxa I choose, the graph and legend are still automatically ordered by abundance. I am trying to make a plot of the top 20 ASVs and wish to reorder the graph and legend so that the ASVs are grouped together by genus; I will then append the genus information to the legend in another program such as Illustrator. This way the figure can easily show changes both in ASVs and at the genus level. Here is a sample of my unsuccessful code:

ASV_order_list <- c("ASV_5", "ASV_647", "ASV_76", "ASV_216", "ASV_34", "ASV_11", "ASV_22", "ASV_98", "ASV_134", "ASV_15", "ASV_36", "ASV_41", "ASV_44", "ASV_61", "ASV_172", "ASV_179", "ASV_334", "ASV_528", "ASV_541", "ASV_633")

my_microeco_ds_abund <- trans_abund$new(dataset = my_microeco_ds, taxrank = "ASV", input_taxaname = ASV_order_list)

my_microeco_ds_abund$plot_bar(use_alluvium = TRUE, xtext_type_hor = FALSE, facet = "Water_Depth", facet2 = "Month", x_axis_name = "Site", col = color_palette_20)

I have tried several variations of this, including putting the c() vector of ASVs directly after "input_taxaname =". My question is: am I doing this wrong, or is it a bug?

Additionally, if you are able to add a new parameter to trans_abund$plot_bar (and other commands that make plots) that gives more flexibility in the legend names, that would be a greatly appreciated quality-of-life improvement. I'm thinking of something like "legend_name_levels = c("ASV", "Genus", "Class")" with the output "ASV_123 | Limnohabitans | Gammaproteobacteria", or something along those lines. That would be more useful and powerful for many of the figures microeco can make than using either the entire taxonomic string or a label from just one taxonomic level. I'm not sure how straightforward this is on the coding side, and I completely understand if your time is better spent on other projects.

As always, I appreciate your help and all the time and energy you put into growing and maintaining the microeco package. It has become a one-stop-shop type of package for microbial ecology and I've been recommending it to anyone who doesn't use it already.

Thanks again! Jake Callaghan

ChiLiubio commented 1 year ago

Hi Jake,

Good question! For the first one, I found a bug in the input_taxaname parameter: the plotted order does not follow the order given in input_taxaname. I will fix it in the next release. Here is a temporary workaround that changes data_taxanames in the object directly.

library(magrittr)
library(microeco)
data(dataset)
# use the example data for the whole demonstration
d1 <- clone(dataset)
# add ASV level for the example
d1$add_rownames2taxonomy(use_name = "ASV")
d1$cal_abund()
# delete redundant lineages in the rownames to make the operation easier
rownames(d1$taxa_abund$ASV) %<>% gsub(".*\\|", "", .)
tmp <- trans_abund$new(dataset = d1, taxrank = "ASV", ntaxa = 3)
tmp$plot_bar()
# assign data_taxanames directly to change the order
tmp$data_taxanames <- c("OTU_34", "OTU_2", "OTU_1")
tmp$plot_bar()
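
The gsub() step in the workaround above can be checked on its own with base R. This is a minimal sketch, assuming row names in a "lineage|...|name" form; the strings below are made up for illustration and are not taken from the example dataset.

```r
# Hypothetical row names in a "lineage|...|name" form (made up for illustration)
full_names <- c(
  "k__Bacteria|p__Proteobacteria|OTU_1",
  "k__Bacteria|p__Firmicutes|OTU_2",
  "k__Archaea|p__Thaumarchaeota|OTU_34"
)

# ".*\\|" is greedy: it matches everything up to the last "|",
# so only the final field of each name is kept
short_names <- gsub(".*\\|", "", full_names)
print(short_names)

# Desired display order, as a character vector like the one assigned to data_taxanames
wanted_order <- c("OTU_34", "OTU_2", "OTU_1")

# Every requested name should be found among the shortened names
stopifnot(all(wanted_order %in% short_names))
```

Checking that every requested name is present before assigning the vector may help catch spelling or prefix mismatches early.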

For the second question, about the legend, one way to solve it is to change the tax_table.

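d1 <- clone(dataset)
d1$add_rownames2taxonomy(use_name = "ASV")
# keep only the ranks wanted in the legend, in the order they should appear
d1$tax_table %<>% .[, c("Class", "Genus", "ASV"), drop = FALSE]
# recalculate abundances so the taxa names use the reduced tax_table
d1$cal_abund()
tmp <- trans_abund$new(dataset = d1, taxrank = "ASV", ntaxa = 3)
tmp$plot_bar()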

However, if you use this approach to plot taxa at another level, please remember to change the delete_full_prefix parameter. By default, delete_full_prefix deletes the prefix "*__" in the legend. The "ASV" names have no prefix, so changing it was not necessary in the previous example. For "Genus" it is necessary, because the corresponding tax_table column contains the prefix, so set delete_full_prefix = FALSE.

d1 <- clone(dataset)
d1$tax_table %<>% .[, c("Class", "Genus"), drop = FALSE]
d1$cal_abund()
tmp <- trans_abund$new(dataset = d1, taxrank = "Genus", ntaxa = 3, delete_full_prefix = FALSE)
tmp$plot_bar()
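
To illustrate why the prefix setting matters for a combined label, here is a small base R sketch. It is not microeco's internal code; the label and the two regular expressions are assumptions chosen only to show the difference between stripping everything up to the last prefix and stripping the rank prefixes alone.

```r
# Hypothetical combined label in a "Class|Genus" form with rank prefixes
# (made up for illustration)
label <- "c__Bacilli|g__Bacillus"

# Greedy match: removes everything up to the last "__", so the Class part is lost
full_strip <- gsub(".*__", "", label)

# Removing only the "x__" rank prefixes keeps both ranks
prefix_only <- gsub("[a-z]__", "", label)

print(full_strip)
print(prefix_only)
```

In this sketch the first pattern leaves only the genus name, while the second keeps both the class and the genus.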

Please tell me if anything I wrote is unclear or cannot be applied to your data. Thanks very much for finding the bug and for your good suggestion!

Best, Chi

ghost commented 1 year ago

Hi Chi,

Thanks for all the tutorials; it is really a great package! I have a quick question about facet order in plot_bar and plot_heatmap. I have five different plant types in my metadata$genotype (sample_table). When I set facet = "genotype", I see my five plant types in the plot; however, I want to change their order. What parameter can I use in trans_abund or trans_abund$plot_bar to rearrange the facet order?

Thank you in advance, Best regards Ilksen

ChiLiubio commented 1 year ago

Hi Ilksen, Thanks. The easiest way is to assign factors as the tutorial shows (https://chiliubio.github.io/microeco_tutorial/notes.html#group-order). Here is one example using the example data.

rm(list = ls())
library(microeco)
library(magrittr)
data(dataset)
t1 <- trans_abund$new(dataset = dataset, taxrank = "Phylum", ntaxa = 10)
t1$plot_bar(others_color = "grey70", facet = c("Group", "Type"))
# with factors: the order of the levels sets the facet order
dataset$sample_table$Group %<>% factor(levels = c("TW", "CW", "IW"))
t1 <- trans_abund$new(dataset = dataset, taxrank = "Phylum", ntaxa = 10)
t1$plot_bar(others_color = "grey70", facet = c("Group", "Type"))

For multiple facets, please assign factor levels to each variable separately.
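
Setting levels on more than one facet variable can be sketched in base R as follows. The data frame is a hypothetical stand-in for dataset$sample_table, and the Type values are made up for illustration.

```r
# Hypothetical sample table standing in for dataset$sample_table
sample_table <- data.frame(
  Group = c("IW", "CW", "TW", "CW"),
  Type = c("B", "A", "C", "A"),
  stringsAsFactors = FALSE
)

# One factor() call per facet variable; the order of the levels is the intended panel order
sample_table$Group <- factor(sample_table$Group, levels = c("TW", "CW", "IW"))
sample_table$Type <- factor(sample_table$Type, levels = c("C", "A", "B"))

print(levels(sample_table$Group))
print(levels(sample_table$Type))

# Values missing from `levels` would become NA, so check that none were dropped
stopifnot(!anyNA(sample_table$Group), !anyNA(sample_table$Type))
```

Listing every value that occurs in the column inside `levels` avoids silently turning samples into NA.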

Chi

CQMUyan commented 5 months ago

Hello, I have multiple cohorts for which I want to create bar charts. However, I noticed that the top 10 phyla in each cohort are not exactly the same. Therefore, I looked into customizing the order of plotting as you mentioned. But when I specify the phyla for plotting using data_taxanames, the resulting chart only shows "others". This is the erroneous code I used:

t1 <- trans_abund$new(dataset = dataset, taxrank = "Phylum", groupmean = "Group")
t1$data_taxanames <- c("Firmicutes", "Proteobacteria", "Actinobacteriota", "Bacteroidota", "Patescibacteria", "Fusobacteriota")
t1$plot_bar()

Could I consult you on this issue?