arnesmits / DEP

DEP package

Different p-value in each independent run? #33

Open Neo-xbx-00 opened 1 year ago

Neo-xbx-00 commented 1 year ago

I used test_diff for differential enrichment analysis, after imputing my data with the MLE method. For the same protein, the p-values from different runs diverge, and so does the number of significant proteins. Is this normal? The two attached screenshots show part of the results from two runs: one contains 19 differentially expressed proteins, while the other contains only 3.

adomingues commented 1 year ago

Did anything change in between those two runs, or was it literally just running the function test_diff consecutively on the same object?

Neo-xbx-00 commented 1 year ago

> Did anything change in between those two runs, or was it literally just running the function test_diff consecutively on the same object?

Nothing changed between the two runs. I literally just ran test_diff consecutively on the same data frame. Every time I run it, the results are different.
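
A quick way to check this kind of question is to run a single step twice on the same input and compare the outputs. The sketch below is illustrative only: same_on_rerun is a hypothetical helper (not part of DEP), and the toy matrix and the two toy steps merely stand in for real pipeline steps.

```r
# Hypothetical helper (not part of DEP): run a step twice on the same input
# and report whether both outputs agree.
same_on_rerun <- function(step, input) {
  isTRUE(all.equal(step(input), step(input)))
}

# Toy input: 5 "proteins" x 6 samples with a few missing values.
set.seed(1)
mat <- matrix(rnorm(30, mean = 25), nrow = 5)
mat[c(2, 9, 17)] <- NA

# Deterministic step: row means, ignoring missing values.
row_means <- function(m) rowMeans(m, na.rm = TRUE)

# Stochastic step: fill missing values with random draws.
random_fill <- function(m) {
  m[is.na(m)] <- rnorm(sum(is.na(m)), mean = 20)
  m
}

det_ok  <- same_on_rerun(row_means, mat)    # deterministic step: outputs agree
rand_ok <- same_on_rerun(random_fill, mat)  # stochastic step: outputs differ
print(c(deterministic = det_ok, stochastic = rand_ok))
```

The same idea could be applied to one step of the real pipeline at a time (imputation first, then testing) to see where two runs start to differ. How best to compare DEP's result objects is left open here.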

Neo-xbx-00 commented 1 year ago

library("DEP")
library("dplyr")  # for %>%, filter() and select(); assumed not attached elsewhere

# Prepare my data from MaxQuant.
df_protein <- read.table("proteinGroups.txt", sep = "\t", header = TRUE) %>%
  filter(Reverse != "+", Potential.contaminant != "+",
         Only.identified.by.site != "+", Score > 20, Unique.peptides > 1) %>%
  select(2, 64:69, 81)
df_experiment <- read.table("ExperimentalDesign.txt", sep = "\t", header = TRUE)
colnames(df_protein)
df_protein$id %>% duplicated() %>% any()
df_protein_unique <- make_unique(df_protein, "id", "Majority.protein.IDs", delim = ";")
colnames(df_protein_unique)
df_protein_unique$name %>% duplicated() %>% any()
df_LFQ <- grep("LFQ.", colnames(df_protein_unique))
df_se <- make_se(df_protein_unique, df_LFQ, df_experiment)

# Filter, normalize and impute.
df_missval <- filter_missval(df_se, thr = 1)
df_norm <- normalize_vsn(df_missval)
df_imp_MLE <- DEP::impute(df_norm, fun = "MLE")

# Test every condition against the control "FB".
df_diff_all_contrasts <- test_diff(df_imp_MLE, type = "control", control = "FB")
df_diff_all_results <- get_df_wide(df_diff_all_contrasts)

# Denote significant proteins based on user-defined cutoffs.
df_DEP <- add_rejections(df_diff_all_contrasts, alpha = 0.05, lfc = log2(2))
df_DEP_results <- get_results(df_DEP)

# Count the significant proteins.
df_DEP_results %>% filter(significant == "TRUE") %>% nrow()

This is my R script. Every time I run it, I get a totally different result, which is quite strange. The same thing happens with the example data.
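
If the differences come from a step that draws random numbers (imputation is a plausible candidate, though nothing in this thread confirms it), fixing the seed before that step should make runs repeatable. Below is a minimal base-R sketch under that assumption. The data, the imputation rule, and impute_and_test are all made up for illustration and are not DEP's implementation.

```r
# Toy pipeline: random imputation followed by a per-row t-test.
# Everything here is simulated; it only mimics the shape of the workflow.
set.seed(42)
n_prot <- 200
dat <- matrix(rnorm(n_prot * 6, mean = 25), nrow = n_prot)
dat[sample(length(dat), 150)] <- NA            # knock out some values
group <- rep(c("ctrl", "trt"), each = 3)

impute_and_test <- function(m, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)           # fix the RNG state if requested
  m[is.na(m)] <- rnorm(sum(is.na(m)), mean = 22)  # stochastic imputation
  apply(m, 1, function(x) {
    t.test(x[group == "ctrl"], x[group == "trt"])$p.value
  })
}

p_a <- impute_and_test(dat)                    # unseeded, run 1
p_b <- impute_and_test(dat)                    # unseeded, run 2
p_c <- impute_and_test(dat, seed = 123)        # seeded, run 1
p_d <- impute_and_test(dat, seed = 123)        # seeded, run 2

unseeded_match <- isTRUE(all.equal(p_a, p_b))  # unseeded runs disagree
seeded_match   <- identical(p_c, p_d)          # seeded runs agree exactly
print(c(unseeded = unseeded_match, seeded = seeded_match))
```

For the script above, the analogous change would be a set.seed() call immediately before DEP::impute(df_norm, fun = "MLE"). Whether that removes the variation would need to be verified on the actual data.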