Open Mrudhulaks opened 4 years ago
Did you solve this?
I was facing the same problem: tax_glom() was not merging/aggregating taxa (as the documentation says it should) but was only subsetting my data.
For example, running gen <- tax_glom(ps16, taxrank="genus") gave me a dataset containing only the ASVs assigned to genera, not a summed data table.
I also tried aggregate_taxa() from the "microbiome" R package.
Running taxa_names(gen)[1:2] revealed the problem, since it returned this:
TAACACGTAGGGCGCGAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGTGGTTTGCTACGTCCGCTGTGAAAACCTAGGGCTTAACCCTGGGCTTGCAGTGGATACGGACAGACTAGAGGTAGGTAGGGGAGAATGGAATTCCCGGTGTAGCGGTGAAATGCGCAGATATCGGGAGGAACACCAGTGGCGAAGGCGGTTACCTGGTCCTGCACTGACGCTGATGCACGAAAGCTGGGGGAGCAAACGGGATTd:Bacteria(1.0000),p:Actinobacteria(0.9500),c:Actinobacteria(0.9025),o:Actinomycetales(0.8484),f:Thermomonosporaceae(0.5514),g:Actinoallomurus(0.2316)+_Bacteria_Actinobacteria_Actinobacteria_Actinomycetales__
""
TAACACGTAGGGCGCGAGCGTTGTCCGGAATTATTGGGCGTAAAGAGCTCGTAGGTGGTTTGCTACGTCCGCTGTGAAAACCTAGGGCTTAACCCTGGGCTTGCAGTGGATACGGACAGACTAGAGGTAGGTAGGGGAGAATGGAATTCCCGGTGTAGCGGTGAAATGCGCAGATATCGGGAGGAACACCGGTGGCGAAGGCGGTTCTCTGGGCCTTACCTGACACTGAGGAGCGAAAGCGTGGGGAGCGAACAGGATTd:Bacteria(1.0000),p:Actinobacteria(1.0000),c:Actinobacteria(1.0000),o:Actinomycetales(1.0000),f:Thermomonosporaceae(0.8700),g:Actinoallomurus(0.6960)+_Bacteria_Actinobacteria_Actinobacteria_ActinomycetalesThermomonosporaceae ""
This is because, when putting together my phyloseq object, I kept the first two columns of my taxonomic assignment, which are the sequence itself and the "match-score string", i.e. a structure like this:
  seq
1 TAACACCGGCAGCTCAAGTGGTGGCCATTATTATTGGGCCTAAAGCGTTCGTAGCCGGTTTGATAAGTCTCTGGTGAAATCCCGCAGCTTAACTGTGGGACTTGCTGGAGATACTATTAGACTTGAGGTCGGGAGAGGTTAGGGGTACTCCCAGGGTAGGGGTGAAATCCTATAATCCTGGGAGGACCACCTGTGGCGAAGGCGCCTAACTGGAACGAACCTGACGGTGAGTAACGAAAGCCAGGGGCGCGAACCGGATT
2 TAATACCTGCAGCCCAAGTGGTGGTCGATTTTATTGAGTCTAAAACGTTCGTAGCCGGTCTGATAAATCCTTGGGTAAATCGGAAAGCTTAACTTTCCGAATTCCGAGGAGACTGTCAGACTTGGGACCGGGAGAGGCTAGAGGTACTTCTGGGGTAGGGGTAAAATCCTGTAATCCTAGAAGGACCACCGGTGGCGAAGGCGTCTAGCTAGAACGGATCCGACGGTGAGGGACGAAGCCCTGGGTCGCAAACGGGATT
  string
1 d:Archaea(1.0000),p:"Euryarchaeota"(1.0000),c:Methanobacteria(1.0000),o:Methanobacteriales(1.0000),f:Methanobacteriaceae(1.0000),g:Methanobacterium(1.0000)
2 d:Archaea(1.0000),p:"Euryarchaeota"(1.0000),c:Thermoplasmata(0.9900),o:Methanomassiliicoccales(0.9801),f:Methanomassiliicoccaceae(0.9703),g:Methanomassiliicoccus(0.9606)
  sep domian  phyla         class           order                   family                   genus
1 +   Archaea Euryarchaeota Methanobacteria Methanobacteriales      Methanobacteriaceae      Methanobacterium
2 +   Archaea Euryarchaeota Thermoplasmata  Methanomassiliicoccales Methanomassiliicoccaceae Methanomassiliicoccus
By deleting these first columns and keeping only those with properly trimmed taxonomy, I now get this from taxa_names(gen)[5:10]:
[1] "Methanomassiliicoccus"
[2] "Archaea_Thaumarchaeota__"
[3] "Archaea_Thaumarchaeota_Thaumarchaeota_o:Nitrososphaerales"
[4] "Archaea_Thaumarchaeota_Thaumarchaeota_o:Nitrososphaeralesf:Nitrososphaeraceae"
[5] "g:Nitrososphaera"
[6] "Bacteria_____"
So the take-home message is that tax_glom() (and aggregate_taxa()) aggregates across the whole tax_table(). Before I removed those columns, every row was unique, so no aggregation was performed.
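The effect described above can be reproduced on a toy object. The following is only a minimal sketch: the ASV names, counts, sequences and score strings are made up, and the column names "seq" and "string" are assumed to mirror the two extra leading columns from my import. It builds a phyloseq object with those columns in tax_table(), agglomerates at genus level, then drops the columns and agglomerates again.

```r
library(phyloseq)

# Toy counts (made up): 3 ASVs x 2 samples. ASV1 and ASV2 share a genus.
otu <- matrix(c(10, 5,
                 3, 7,
                 2, 8),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("ASV1", "ASV2", "ASV3"), c("S1", "S2")))

# Toy taxonomy (made up). The leading "seq" and "string" columns differ for
# every ASV, just like the sequence and match-score string in the real data.
tax <- matrix(c("AAAA", "g:GenusA(1.00)", "Archaea", "GenusA",
                "CCCC", "g:GenusA(0.90)", "Archaea", "GenusA",
                "GGGG", "g:GenusB(1.00)", "Archaea", "GenusB"),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("ASV1", "ASV2", "ASV3"),
                              c("seq", "string", "domain", "genus")))

ps_raw <- phyloseq(otu_table(otu, taxa_are_rows = TRUE), tax_table(tax))

# With the per-ASV columns in place, no two rows share the same taxonomy,
# so tax_glom() has nothing to merge.
n_raw <- ntaxa(tax_glom(ps_raw, taxrank = "genus"))

# Keep only the real rank columns, then agglomerate again.
ps_clean <- ps_raw
tax_table(ps_clean) <- tax_table(tax[, c("domain", "genus")])
gen <- tax_glom(ps_clean, taxrank = "genus")
n_clean <- ntaxa(gen)

c(before = ntaxa(ps_raw), glom_raw = n_raw, glom_clean = n_clean)
otu_table(gen)  # counts of merged ASVs are summed per sample
```

On a real object, the same idea would be to subset tax_table() to the rank columns by name before calling tax_glom(), rather than by position, so the result does not depend on column order.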
I hope this explanation makes sense!
Hi,
I am trying to apply the tax_glom function to my phyloseq object, but it does not seem to be working.
The object I am working with has 768 OTUs and 201 samples.
The code I used: OTU.Phylum = tax_glom(physeq, taxrank = "Phylum", NArm = FALSE)
The dimensions after running this function are the same as before.
So I tried the code at a different taxonomic level:
OTU.genus = tax_glom(physeq, "Genus")
However, the result is also the same.
Could you please help me resolve this?
Thank you.
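One way to check whether unchanged dimensions are expected is to count how many distinct groups the taxonomy table actually contains at a given rank. The sketch below uses base R only. The helper name count_glom_groups is hypothetical, and it assumes (as far as I understand the phyloseq implementation) that rows are grouped on every column up to and including the chosen rank. If the count equals the number of rows, there is nothing to merge.

```r
# Hypothetical helper: number of distinct groups when each row of a taxonomy
# matrix is keyed on all columns up to and including `rank`.
count_glom_groups <- function(tax, rank) {
  stopifnot(is.matrix(tax), rank %in% colnames(tax))
  upto <- seq_len(match(rank, colnames(tax)))
  keys <- apply(tax[, upto, drop = FALSE], 1, paste, collapse = ";")
  length(unique(keys))
}

# Made-up example: the "id" column is unique per row, so grouping at "Phylum"
# finds as many groups as rows; without it, the first two rows collapse.
tax_demo <- matrix(c("x1", "Bacteria", "Firmicutes",
                     "x2", "Bacteria", "Firmicutes",
                     "x3", "Bacteria", "Proteobacteria"),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(NULL, c("id", "Kingdom", "Phylum")))

groups_with_id    <- count_glom_groups(tax_demo, "Phylum")
groups_without_id <- count_glom_groups(tax_demo[, c("Kingdom", "Phylum")], "Phylum")
```

For a real object, the matrix could be obtained with as(tax_table(physeq), "matrix"), and rank_names(physeq) shows which columns come before the rank being agglomerated.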