joey711 / phyloseq

phyloseq is a set of classes, wrappers, and tools (in R) to make it easier to import, store, and analyze phylogenetic sequencing data; and to reproducibly share that data and analysis with others. See the phyloseq front page:
http://joey711.github.io/phyloseq/
568 stars 187 forks source link

Phyloseq to DESeq2 for 16s amplicon #1095

Open bioinfonext opened 5 years ago

bioinfonext commented 5 years ago

I am trying to use DESeq2 for amplicon data analysis, but I am getting some problem.

Details of the analysis:

here is the summary of data, I do have soil, root, and shoot sample collected from different villages.

and we have given three treatment to each soil, root and shoot sample in each village: TREATMENT T1, T2 AND T3.

Now, the aim is to find which microbial species or genus are more abundant in Treatment T1 and Treatment T3 in comparison to T2 and how T3 microbial abundance is different from T1 Treatment.

I tried to pass the phyloseq object into DESeq2

Phyloseq to Deseq

psfdds<-phyloseqtodeseq2(ps0, ~1) # I just pass 1 as design here, because at later step I am going to pass again group as design.

Group Tissue and Tremant variable into one varibale

psfdds$group <- factor(paste0(psfdds$Tissue,psfdds$Treatment))

group used as design again

design(psfdds) <- ~ group

Deseq

psfdds <- DESeq(psfdds, test="Wald", fitType="parametric")
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
-- replacing outliers and refitting for 10886 genes
-- DESeq argument 'minReplicatesForReplace' = 7
-- original counts are preserved in counts(dds)
estimating dispersions
fitting model and testing

resultsNames(psfdds) [1] "Intercept" "group_LeafT2_vs_LeafT1" "group_LeafT3_vs_LeafT1" [4] "group_RootT1_vs_LeafT1" "group_RootT2_vs_LeafT1" "group_RootT3_vs_LeafT1" [7] "group_SoilT1_vs_LeafT1" "group_SoilT2_vs_LeafT1" "group_SoilT3_vs_LeafT1"

Now extracting result for soil in Treatment1_vs_Treatment2

res_soil.T1_soil.T2_new <- results(psfdds, contrasts=c("group","TissueSoil.TreatmentT1","TissueSoil.TreatmentT2"))

but when I look at summary of result of any comparison it give same value as shown below

summary(ressoil.T1_soil.T2_new) out of 59297 with nonzero total read count adjusted p-value < 0.1 LFC > 0 (up) : 58684, 99% LFC < 0 (down) : 3, 0.0051% outliers [1] : 0, 0% low counts [2] : 3004, 5.1% (mean count < 0) But it seems like there is some problem as it is showing the same result after passing anything.

Could you please suggest how can I resolve it.

I will be very thankful to you.

MSMortensen commented 5 years ago

Hi Yogeshgupt,

I am not an expert in using DeSeq, but from the information available it seems like there might be a problem when you extract your results. The resultsNames are styled as "group_SoilT1_SoilT2" which would indicate that you use the wrong parameter when extracting the results. Try using this command instead: res_soil.T1_soil.T2_new <- results(psfdds, contrasts=c("group", "SoilT1", "SoilT2SoilT2"))

Kind regards, Martin