joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/

Find shared OTUs between two phyloseq objects? #1387

Open pavlo888 opened 4 years ago

pavlo888 commented 4 years ago

Hi,

I was wondering if it is possible to find the shared OTUs between two phyloseq objects, i.e. two distinct OTU tables?

Cheers, Pablo

bathyscapher commented 4 years ago

Yes, this is indeed possible. If the two phyloseq objects are derived from the same parent phyloseq object (e.g. subsetted by treatment, habitat, etc.), then call:

## Subset by treatment, then drop taxa with zero counts in each subset
## (prune_samples alone keeps every taxon, so the intersection would be all taxa)
ps1 <- prune_samples(sample_data(ps)$Treatment == "Treatment1", ps)
ps1 <- prune_taxa(taxa_sums(ps1) > 0, ps1)
ps2 <- prune_samples(sample_data(ps)$Treatment == "Treatment2", ps)
ps2 <- prune_taxa(taxa_sums(ps2) > 0, ps2)

## Get vectors of OTU/ASV names (works whether taxa are rows or columns)
treatment1 <- taxa_names(ps1)
treatment2 <- taxa_names(ps2)

## Get the intersection
shared <- intersect(treatment1, treatment2)

## Subset phyloseq object to shared taxa
ps.s <- prune_taxa(shared, ps)

If the two phyloseq objects differ in how their taxa are numbered, renumber one of them to match the other (hard to demonstrate without example data).

pavlo888 commented 4 years ago

Hi bathyscapher,

Thank you for your reply. I am actually interested in checking the shared OTUs between two different phyloseq objects coming from two different microbiomes.

Should I first merge the objects and then prune them based on the microbiome of origin and then check for the intersection?

Cheers, Pablo

bathyscapher commented 4 years ago

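One problem that comes with merge_phyloseq is that it matches taxa by the names/numbers of the OTUs (the rownames of the internal data.frame of the S4 object, if I remember correctly), so OTUs that happen to carry the same number in both objects are treated as the same taxon.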
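To illustrate the point about merge_phyloseq working on OTU names, here is a minimal sketch with two made-up OTU tables. The counts, sample names, and the objects otu_a and otu_b are hypothetical; only the phyloseq calls themselves are real.

```r
library(phyloseq)

# Two toy OTU tables from hypothetical, independent runs (made-up counts).
# "OTU1" and "OTU2" occur in both, but nothing guarantees that they
# refer to the same organism in each run.
otu_a <- otu_table(
  matrix(c(5, 0, 3, 2), nrow = 2,
         dimnames = list(c("OTU1", "OTU2"), c("A1", "A2"))),
  taxa_are_rows = TRUE
)
otu_b <- otu_table(
  matrix(c(1, 4, 0, 7, 2, 2), nrow = 3,
         dimnames = list(c("OTU1", "OTU2", "OTU3"), c("B1", "B2"))),
  taxa_are_rows = TRUE
)

merged <- merge_phyloseq(otu_a, otu_b)

# Taxa are matched purely by name: the merged table holds the union of
# the names, so same-named OTUs end up in a single row.
taxa_names(merged)
sample_names(merged)
```

If the names in the two objects were assigned independently, it is probably safer to compare taxa through something both objects share (taxonomy or sequences) before merging.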

If you only need the intersection of the taxa, one possibility is to extract the taxonomy table from both phyloseq objects and run intersect on the taxonomy. This may need some concatenation of cells, depending on your actual taxa (for instance, if you have a lot of unclassified genera, take a combination of Phylum + Class + Genus). Make sure there are no empty taxa. If there are, remove them either in the phyloseq object or in the data.frame (whatever suits you better).
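A rough sketch of that taxonomy-based comparison follows. The helper tax_keys and the two toy taxonomy tables are made up for illustration, and the choice of ranks (Phylum, Class, Genus) is only an assumption that should be adapted to the actual data.

```r
library(phyloseq)

# Hypothetical helper: paste the chosen ranks into one key per taxon,
# dropping taxa with a missing or empty value at any of those ranks.
tax_keys <- function(ps, ranks = c("Phylum", "Class", "Genus")) {
  tax <- as(tax_table(ps), "matrix")[, ranks, drop = FALSE]
  complete <- rowSums(is.na(tax) | tax == "") == 0
  unique(apply(tax[complete, , drop = FALSE], 1, paste, collapse = ";"))
}

# Toy taxonomy tables with unrelated OTU names (made-up data)
tax_w <- tax_table(matrix(
  c("Proteobacteria", "Gammaproteobacteria", "Pseudomonas",
    "Firmicutes",     "Bacilli",             "Bacillus",
    "Firmicutes",     "Bacilli",             NA),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("OTU1", "OTU2", "OTU3"), c("Phylum", "Class", "Genus"))
))
tax_r <- tax_table(matrix(
  c("Firmicutes",     "Bacilli",        "Bacillus",
    "Actinobacteria", "Actinobacteria", "Streptomyces"),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("OTU_a", "OTU_b"), c("Phylum", "Class", "Genus"))
))

# Taxa shared at the level of the concatenated taxonomy
shared_keys <- intersect(tax_keys(tax_w), tax_keys(tax_r))
shared_keys
```

Note that this compares taxa at the resolution of the chosen ranks only: two different OTUs assigned to the same genus collapse into one key.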

There might be a more direct approach though...

If you need to manipulate the phyloseq object further, I suggest you provide some example data; that makes it easier to help :)

pavlo888 commented 3 years ago

Hi there,

So I managed to merge my two phyloseq objects, without including the phylogenetic trees, with the following call:

phy_merge <- merge_phyloseq(physeqw, physeqr)

phyloseq-class experiment-level object
otu_table()   OTU Table:         [ 11068 taxa and 528 samples ]
sample_data() Sample Data:       [ 528 samples by 15 sample variables ]
tax_table()   Taxonomy Table:    [ 11068 taxa by 7 taxonomic ranks ]

Now, I would like to see which taxa are shared based on a metadata column named "Matrix" that has two levels: Rockwool and Water. How could I proceed with this?

Thank you in advance for your help!

Cheers, Pablo
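One possible way to approach the question above, sketched on a made-up toy object rather than the real phy_merge: subset the samples by each level of "Matrix", keep only taxa with non-zero counts in that subset, and intersect the names. The helper taxa_present and the object phy_toy are hypothetical.

```r
library(phyloseq)

# Toy stand-in for the merged object (made-up counts)
otu <- matrix(c(3, 1, 0, 0,
                0, 0, 4, 2,
                5, 0, 1, 0,
                0, 0, 0, 6),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("OTU", 1:4), paste0("S", 1:4)))
meta <- data.frame(Matrix = c("Rockwool", "Rockwool", "Water", "Water"),
                   Replicate = c(1, 2, 1, 2),
                   row.names = paste0("S", 1:4))
phy_toy <- phyloseq(otu_table(otu, taxa_are_rows = TRUE), sample_data(meta))

# Hypothetical helper: names of taxa with at least one read in the
# samples whose `variable` equals `level`
taxa_present <- function(ps, variable, level) {
  keep <- get_variable(ps, variable) %in% level
  sub <- prune_samples(keep, ps)
  taxa_names(prune_taxa(taxa_sums(sub) > 0, sub))
}

rockwool <- taxa_present(phy_toy, "Matrix", "Rockwool")
water    <- taxa_present(phy_toy, "Matrix", "Water")

# Taxa detected in both matrices
shared <- intersect(rockwool, water)
shared
```

With the real data, phy_toy would be replaced by phy_merge. Whether "present" should mean at least one read or some stricter abundance or prevalence cutoff is a judgment call that this sketch does not settle.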

Madegwa commented 3 years ago

Hello, thanks for your assistance. I am also interested in the same analysis. Is it possible to extract the OTU table that is unique to each treatment? I have two seasons as treatments (Summer and Spring). I now know how to get the intersection, but I am also interested in the OTUs that are not shared (not in the intersection), i.e. those unique to Summer alone and to Spring alone after removing the shared OTUs (so only unique to each region). Is this possible? Any assistance will be highly appreciated.
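For the unique (non-shared) OTUs, one option is to work on presence/absence per season and take the set differences. The sketch below uses a made-up toy object; the object name ps_toy and the column name "Season" are assumptions for illustration.

```r
library(phyloseq)

# Toy object with two Spring and two Summer samples (made-up counts)
otu <- matrix(c(2, 0, 0, 0,
                0, 3, 0, 0,
                1, 0, 4, 0,
                0, 0, 0, 5,
                0, 0, 2, 2),
              nrow = 5, byrow = TRUE,
              dimnames = list(paste0("OTU", 1:5), c("Sp1", "Sp2", "Su1", "Su2")))
meta <- data.frame(Season = c("Spring", "Spring", "Summer", "Summer"),
                   Replicate = c(1, 2, 1, 2),
                   row.names = colnames(otu))
ps_toy <- phyloseq(otu_table(otu, taxa_are_rows = TRUE), sample_data(meta))

# Count matrix with taxa as rows, regardless of how it is stored
otu_mat <- as(otu_table(ps_toy), "matrix")
if (!taxa_are_rows(ps_toy)) otu_mat <- t(otu_mat)

# Presence (at least one read) of every taxon in each season
season <- get_variable(ps_toy, "Season")
in_spring <- rowSums(otu_mat[, season %in% "Spring", drop = FALSE]) > 0
in_summer <- rowSums(otu_mat[, season %in% "Summer", drop = FALSE]) > 0

# Taxa found in one season but not in the other
only_spring <- names(which(in_spring & !in_summer))
only_summer <- names(which(in_summer & !in_spring))

# Phyloseq objects restricted to the season-specific taxa
ps_only_spring <- prune_taxa(only_spring, ps_toy)
ps_only_summer <- prune_taxa(only_summer, ps_toy)
```

The pruned objects still contain the samples of the other season (with zero counts for these taxa); prune_samples can be applied afterwards if only the matching samples are wanted.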