ChiLiubio / microeco

An R package for data analysis in microbial community ecology
GNU General Public License v3.0

tidy_taxonomy function does not properly perform tax_table clean up #354

Open BiggusDickus666 opened 2 months ago

BiggusDickus666 commented 2 months ago

Hello ChiLiubio,

First of all, I would like to thank you for your amazing work in developing the microeco package. It has been a relief for us, microbiologists without a solid R programming foundation, to have stumbled upon your creation.

I am using tidy_taxonomy to remove the unassigned/unknown taxa from my tax_table so that they do not show up in my plots, but so far I have not managed to make it work, and the unassigned taxa keep showing up. Here is a snippet of the code I am using:

tidy_taxonomy(
  MicroEcoDataObject$tax_table,
  column = "all",
  pattern = c(".unassigned.", ".uncultur.", ".unknown.", ".unidentif.", ".unclassified.", ".No blast hit.", ".Incertae.sedis."),
  replacement = "",
  ignore.case = TRUE,
  na_fill = ""
)

I am afraid I cannot provide the .qza files I obtained from QIIME2 to build the microtable object that the microeco package uses, but I will be happy to send them by email if needed.

Thank you very much in advance

ChiLiubio commented 2 months ago

Hi. Do you mean the function does not work? Please show the full steps so that I can judge whether the problem comes from something else. The function is simple, so it should behave normally if it is used properly. Here are the steps for checking the data:

tmp_raw <- MicroEcoDataObject$tax_table
# please first use the default params to check it
tmp_new <- tidy_taxonomy(tmp_raw)
View(tmp_raw)
View(tmp_new)
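
Besides inspecting the two tables with View(), the changed cells can also be listed programmatically. The following is only a minimal sketch in base R: the two toy tables are made up and stand in for tmp_raw and tmp_new, and the comparison assumes both tables have the same shape and contain no NA values.

```r
# Toy before/after tables standing in for tmp_raw and tmp_new (values are made up).
tmp_raw <- data.frame(
  Family = c("f__Bacillaceae", "f__unknown"),
  Genus  = c("g__uncultured", "g__Bacillus"),
  stringsAsFactors = FALSE
)
tmp_new <- data.frame(
  Family = c("f__Bacillaceae", "f__"),
  Genus  = c("g__", "g__Bacillus"),
  stringsAsFactors = FALSE
)

# Row/column positions of every cell that differs between the two tables.
changed <- which(tmp_raw != tmp_new, arr.ind = TRUE)
print(changed)
```

If `changed` has zero rows, the cleaning step did not alter anything in the table.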

BiggusDickus666 commented 2 months ago

Hello, I have tried inserting your suggested code into mine, but unknown taxa still keep appearing when I create a taxa heatmap. This is the original workflow I used before your suggestion:

Importing .qza from QIIME2

taxonomy_microeco <- "/home/victor/Documentos/BioinformaticsPisaUbuntu/Replica1RECOVERsoilUMHbioinformatics/DataWO234/Taxonomy/taxonomy.qza"
tree_microeco <- "/home/victor/Documentos/BioinformaticsPisaUbuntu/Replica1RECOVERsoilUMHbioinformatics/DataWO234/tree/rooted-tree.qza"
table_microeco <- "/home/victor/Documentos/BioinformaticsPisaUbuntu/Replica1RECOVERsoilUMHbioinformatics/DataWO234/dirDADA2/table.qza"
rep_microeco <- "/home/victor/Documentos/BioinformaticsPisaUbuntu/Replica1RECOVERsoilUMHbioinformatics/DataWO234/dirDADA2/representative-sequences.qza"

Creating the data frame for metadata

metadata2microeco <- data.frame(
  SampleID = c("S10-16S", "S11-16S", "S9-16S", "S12-16S", "S1-16S", "S8-16S", "S5-16S", "S6-16S", "S7-16S"),
  Time = c("T60", "T60", "T60", "T60", "T0", "T60", "T60", "T60", "T60"),
  Group = c("Inoculum", "Inoculum", "Inoculum", "Inoculum", "No Inoculum", "No Inoculum", "No Inoculum", "No Inoculum", "No Inoculum"),
  Type = c("LDPE", "LLDPE", "No plastic", "FILM", "No plastic", "FILM", "No plastic", "LDPE", "LLDPE")
)

Using file2meco package to create Microtable Object

MicroEcoDataObject <- qiime2meco(table_microeco, sample_table = metadata2microeco, taxonomy_table = taxonomy_microeco, phylo_tree = tree_microeco, rep_fasta = rep_microeco, auto_tidy = TRUE)
MicroEcoDataObject

Filtering taxa_table using tidy_taxonomy function

tidy_taxonomy(
  MicroEcoDataObject$tax_table,
  column = "all",
  pattern = c(".unassigned.", ".uncultur.", ".unknown.", ".unidentif.", ".unclassified.", ".No blast hit.", ".Incertae.sedis."),
  replacement = "",
  ignore.case = TRUE,
  na_fill = ""
)

Creating taxa heatmap

TaxaHeatmap <- trans_abund$new(dataset = MicroEcoDataObject, taxrank = "Genus", ntaxa = 40)
TaxaHeatmap$plot_heatmap(facet = c("Type","Group"), xtext_keep = FALSE, withmargin = FALSE, plot_breaks = c(0.01, 0.1, 1, 10))

This is the heatmap. As you can see, odd taxon labels such as 67-14 or MB-A2-108 keep appearing. I do not know whether tidy_taxonomy is able to remove these labels.

TaxaHeatmapMicro.pdf

Thank you very much in advance

ChiLiubio commented 2 months ago

Hi. First, you should assign the result back to MicroEcoDataObject$tax_table when using tidy_taxonomy.

MicroEcoDataObject$tax_table <- tidy_taxonomy(MicroEcoDataObject$tax_table)
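
The assignment matters because an R function normally returns a modified copy and leaves its argument untouched. Below is a minimal base-R sketch of that behaviour. The toy table and the `clean_labels()` helper are hypothetical stand-ins for illustration only, not the package's implementation of tidy_taxonomy.

```r
# Toy taxonomy table; the values are made up for illustration.
tax <- data.frame(
  Genus = c("g__Bacillus", "g__uncultured", "g__Pseudomonas"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for a cleaning function:
# blanks any label that contains "uncultur" and returns the modified copy.
clean_labels <- function(tbl) {
  tbl$Genus <- gsub(".*uncultur.*", "", tbl$Genus, ignore.case = TRUE)
  tbl
}

# Calling without assignment only prints the result; `tax` itself is untouched.
clean_labels(tax)
unchanged <- identical(tax$Genus, c("g__Bacillus", "g__uncultured", "g__Pseudomonas"))

# Assigning the result back is what actually updates the table.
tax <- clean_labels(tax)
```

The same logic applies to the workflow above: any plot built from MicroEcoDataObject after an unassigned tidy_taxonomy call still sees the original tax_table.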

Second, taxa like 67-14 or MB-A2-108 are not what are generally regarded as uninformative labels, so the function cannot filter them with the default parameters. If you want to filter all taxa whose names contain numbers, you can add an item to the pattern parameter like this: [image]
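
The image content is not recoverable here, so the following is only a sketch of the general idea in base R, not the pattern that was actually posted. It assumes a regular expression that matches any label containing a digit; `digit_pattern` and the toy labels are hypothetical.

```r
# Toy genus labels; values are illustrative only.
genus <- c("Bacillus", "67-14", "MB-A2-108", "Clostridium_sensu_stricto_1")

# Assumed extra pattern: any label that contains at least one digit.
digit_pattern <- ".*[0-9].*"

# Which labels would this pattern match, and what remains after blanking them?
hits <- grepl(digit_pattern, genus)
cleaned <- ifelse(hits, "", genus)
print(data.frame(genus, hits, cleaned))
```

A pattern of this kind could presumably be appended to the existing vector passed to the pattern argument. Note that it is deliberately broad: it also blanks legitimate names that happen to contain a digit, such as the last toy label, so it is worth checking the matches before applying it.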

BiggusDickus666 commented 2 months ago

Problem solved!! Thank you very much @ChiLiubio