vdemichev / DiaNN

DIA-NN - a universal automated software suite for DIA proteomics data analysis.
Other
274 stars 53 forks source link

LFQbench with DIA-NN results #69

Closed MarcIsak closed 3 years ago

MarcIsak commented 3 years ago

Hi Vadim,

I have DIA-runs from 2 hybrid-proteome samples (different but known ratios of Yeast and Mouse peptides), that I plan on running through DIA-NN. I have created different spectral libraries that I would like to use for the analysis. In order to benchmark which is the better spectral library, I would like to use the LFQbench software.

Unfortunately, it seems like LFQBench does not yet include DIA-NN reports. Still, I have seen LFQbench results from DIA-NN analyses on Twitter. I just wonder if you may share a script or similar that allows me to run LFQbench on the DIA-NN reports?

Thank you in advance, and thank you for creating such a high performance DIA software!

Best,

Marc

vdemichev commented 3 years ago

Hi Marc,

I don't recommend using the original LFQbench R package for LFQbench-bencmarks. It's a bit non-transparent how it does things. Writing the code on your own would be quicker (it's very easy), more reliable and will allow you to look at different kinds of performance metrics. If you do want to use LFQbench anyway, you only need to configure it so it recognises column names in DIA-NN report (there was a tutorial on how to do this I think). Or, alternatively, rename the columns in DIA-NN report to match the report format of Spectronaut.

I have some in-house scripts, below is one example (it's not meant to be run, just a reference on how things like that can be processed). Hope this helps!

Best wishes,

Vadim

library(diann) # https://github.com/vdemichev/diann-rpackage library(scales) library(data.table) library(reshape2) setwd('C:/Temp')

pal <- c('springgreen3','chocolate3','royalblue') ALPHA = 0.2 cex = 0.5 MIN_PEPS <- 2 UNIQUE <- T

data <- diann_load('hye124_analyse_robust.tsv')

for (MQ in c(0.01, 0.005, 0.001)) { q <- MQ pgq <- MQ libq <- MQ df <- data[which(data$Q.Value <= q & data$PG.Q.Value <= pgq & data$Lib.Q.Value <= libq),] if (UNIQUE) df <- df[!grepl(';',df$Protein.Names),] pr <- diann_matrix(df) ids <- colSums(!is.na(pr)) print(ids)

pg <- diann_maxlfq(df)

A <- grepl('A',colnames(pg)) B <- grepl('B',colnames(pg)) human <- grepl('HUMAN',rownames(pg)) & !grepl('YEAST',rownames(pg)) & !grepl('ECOLI',rownames(pg)) yeast <- !grepl('HUMAN',rownames(pg)) & grepl('YEAST',rownames(pg)) & !grepl('ECOLI',rownames(pg)) ecoli <- !grepl('HUMAN',rownames(pg)) & !grepl('YEAST',rownames(pg)) & grepl('ECOLI',rownames(pg)) type <- rep(NA,nrow(pg)) type[human] <- 'human' type[yeast] <- 'yeast' type[ecoli] <- 'ecoli' type <- factor(type, levels = c('human','yeast','ecoli'))

key <- paste0(df$Stripped.Sequence, df$File.Name) x <- df[!duplicated(key),] peps <- aggregate(x[,c('File.Name','Protein.Names','Stripped.Sequence')], by = list(x$File.Name,x$Protein.Names), FUN = length) peps <- dcast(peps, Group.2 ~ Group.1, value.var = 'Stripped.Sequence') rownames(peps) <- peps$Group.2 peps$Group.2 <- NULL

x <- pg peps <- peps[rownames(x), colnames(x)] x[which(peps < MIN_PEPS)] <- NA

pgA <- log2(rowMeans(x[,A],na.rm=T)) pgB <- log2(rowMeans(x[,B],na.rm=T)) pgR <- pgA - pgB print(paste0('MQ = ',MQ,': Valid ratios: ',length(which(is.finite(pgR))),' for ',nrow(x[rowSums(!is.na(x)) > 0,]),' proteins'))

sel <- is.finite(pgR) & !is.na(type) pdf(paste0('hye124 - ',MQ,'.pdf')) par(mfrow = c(2,2)) plot(pgR[human] ~ pgB[human], pch = 16, col = alpha(pal[1], ALPHA), xlim = range(2,16), ylim = range(-3,2), cex = cex, xlab = 'log2(B)', ylab = 'log2(A/B)') points(pgR[yeast] ~ pgB[yeast], pch = 16, col = alpha(pal[2], ALPHA), cex = cex) points(pgR[ecoli] ~ pgB[ecoli], pch = 16, col = alpha(pal[3], ALPHA), cex = cex) abline(h = 1.0, col = alpha('black',0.3)) abline(h = 0.0, col = alpha('black',0.3)) abline(h = -2.0, col = alpha('black',0.3)) boxplot(pgR ~ type, col = pal, outline = F, ylim = range(-3,2), xlab = '', ylab = '') abline(h = 1.0, col = alpha('black',0.3)) abline(h = 0.0, col = alpha('black',0.3)) abline(h = -2.0, col = alpha('black',0.3)) dev.off() }

MarcIsak commented 3 years ago

Thank you for the insanely quick response. I'll let you know soon if I can make it work.

Have a nice day!

Marc