joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Finding Unique Taxa #859

Open kholt1992 opened 6 years ago

kholt1992 commented 6 years ago

Hi,

I want to subset my phyloseq object by two groups of samples in my mapping file (5 samples per group), and find taxa found in at least two samples for one group, and found in none of the samples in the other group. Does anyone know how to do this?

Thanks, Kevin

seashore001x commented 6 years ago

1. Use tax_glom() to agglomerate taxa of the same type.
2. Extract the otu_table with taxa information from the phyloseq object.
3. Use filter() from the dplyr package to subset your otu_table based on your group.
4. Extract the taxa information (column names) from each subset otu_table and compare the two subsets using match() or %in%.
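The steps above can be sketched in base R as follows. This is only a minimal sketch, not a phyloseq feature: the helper name `taxa_unique_to_group`, the toy counts, and the group labels "A" and "B" are all made up for illustration. It assumes the counts are in a plain matrix with taxa as rows and samples as columns, and it uses logical indexing instead of dplyr's filter(). The comments show one way such a matrix might be pulled out of a phyloseq object, assuming the object is called `physeq` and the grouping variable is named "Group".

```r
# With a phyloseq object (assumed to be called `physeq`, with a sample
# variable assumed to be named "Group"), the inputs could be obtained as:
#   physeq <- tax_glom(physeq, taxrank = "Genus")   # optional agglomeration
#   otu <- as(otu_table(physeq), "matrix")
#   if (!taxa_are_rows(physeq)) otu <- t(otu)
#   group <- as.character(get_variable(physeq, "Group"))

# Hypothetical helper: taxa detected in at least `min_samples` samples of
# `target` and in no sample of `other`.
#   otu:   numeric count matrix, taxa as rows, samples as columns
#   group: vector with one group label per column of `otu`
taxa_unique_to_group <- function(otu, group, target, other, min_samples = 2) {
  stopifnot(ncol(otu) == length(group))
  present  <- otu > 0                                   # presence/absence
  n_target <- rowSums(present[, group == target, drop = FALSE])
  n_other  <- rowSums(present[, group == other,  drop = FALSE])
  rownames(otu)[n_target >= min_samples & n_other == 0]
}

# Toy data standing in for a real OTU table (made-up values)
otu <- matrix(
  c(5, 3, 0, 0,   # OTU1: two samples of A, none of B
    1, 0, 0, 0,   # OTU2: only one sample of A
    2, 4, 1, 0,   # OTU3: also present in B
    0, 0, 7, 2),  # OTU4: only in B
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("OTU", 1:4), paste0("S", 1:4))
)
group <- c("A", "A", "B", "B")

unique_A <- taxa_unique_to_group(otu, group, target = "A", other = "B")
print(unique_A)
```

If the result is needed as a phyloseq object again, the returned taxa names could be passed to prune_taxa(), for example prune_taxa(unique_A, physeq).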

Marieag commented 5 years ago

I'm trying to do something similar - I need to find out whether any pathogens are present in only certain tissue groups in my dataset.

I've merged my dataset into the three sample groups, sample_data(physeq)[,7]:

Asym_arm_asym_plant
Asym_arm_sym_plant
Sym_arm

I'd like to find out which OTUs are present in ONLY one of these three groups, but I can't seem to figure out how to construct the syntax for filtering my OTUs. I've tried something like this:

physeq %>% filter_taxa(sample_data(physeq)[,7] == "Asym_arm_asym_plant" & sample_data(physeq)[,7] != "Asym_arm_sym_plant" & sample_data(physeq)[,7] != "Sym_arm")

and

sym_arm <- filter(otu_table, Asym_arm_asym_plant == 0, Asym_arm_sym_plant == 0, Sym_arm > 0)

but those just give me errors. I'm pretty sure it's just syntax errors, but I'm not super R-savvy, so yeah. Can you guys give me a pointer in the right direction? Thanks.
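One possible source of the errors: filter_taxa() expects a function that is applied to each taxon's abundance vector, not a comparison on sample_data, and dplyr's filter() works on data frames, whereas `otu_table` on its own is the phyloseq accessor function. A minimal sketch of one way to find OTUs present in only one group is given below, in base R on a plain count matrix. The helper name `taxa_exclusive_by_group` and the toy counts are made up for illustration; the group names are the ones from the post, and each group is assumed to be a single merged sample (the helper also accepts several samples per group).

```r
# With a phyloseq object (assumed to be called `physeq`), the inputs could
# be obtained roughly as:
#   otu <- as(otu_table(physeq), "matrix")
#   if (!taxa_are_rows(physeq)) otu <- t(otu)
#   group <- as.character(sample_data(physeq)[[7]])

# Hypothetical helper: for each group, the taxa detected in that group only.
#   otu:   numeric count matrix, taxa as rows, samples as columns
#   group: vector with one group label per column of `otu`
taxa_exclusive_by_group <- function(otu, group) {
  stopifnot(ncol(otu) == length(group))
  # groups x taxa: number of samples per group in which each taxon occurs
  n_detected <- rowsum(t(otu > 0) * 1, group = as.character(group))
  in_group   <- n_detected > 0
  exclusive  <- colSums(in_group) == 1       # detected in exactly one group
  groups <- rownames(in_group)
  lapply(setNames(groups, groups), function(g) {
    colnames(in_group)[exclusive & in_group[g, ]]
  })
}

# Toy data standing in for the merged OTU table (made-up values)
otu <- matrix(
  c(4, 0, 0,   # OTU1: only Asym_arm_asym_plant
    0, 0, 9,   # OTU2: only Sym_arm
    3, 2, 0,   # OTU3: two groups
    1, 1, 1,   # OTU4: all three groups
    0, 6, 0),  # OTU5: only Asym_arm_sym_plant
  nrow = 5, byrow = TRUE,
  dimnames = list(
    paste0("OTU", 1:5),
    c("Asym_arm_asym_plant", "Asym_arm_sym_plant", "Sym_arm")
  )
)
group <- colnames(otu)

only_in <- taxa_exclusive_by_group(otu, group)
print(only_in)
```

The taxa names for one group, for example only_in[["Sym_arm"]], could then be passed to prune_taxa() to get back a phyloseq object restricted to those OTUs.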