yafeng / DEqMS

DEqMS is a tool for quantitative proteomic analysis

Error running DEqMS #11

Open HYTang69 opened 1 year ago

HYTang69 commented 1 year ago

Hello, I am new to DEqMS. I am trying to analyze MaxQuant data. I was able to follow the vignette's instructions to process the PXD000279 data. However, when I loaded my own proteinGroups.txt, I encountered an error running this command:

pep.count.table = data.frame(count = rowMins(as.matrix(df.prot[,19:24])), row.names = df.prot$Majority.protein.IDs)

Error in rowMins(as.matrix(df.prot[, 19:24])) : Argument 'x' cannot be logical.

I wonder if you have any idea what caused this error.

Thank you for your time!

Hsin-Yao

yafeng commented 1 year ago

Hi, you need to make some changes. First, to get the protein intensity table, you need to update the column numbers in the following code from the vignette:

df.LFQ = df.prot[,89:94]
df.LFQ[df.LFQ==0] <- NA

Change it to:

# Extract columns of LFQ intensities
quant_cols <- grep("LFQ",colnames(df.prot)) # to find the column number of LFQ quant value in your input
df.LFQ <- df.prot[,quant_cols]
df.LFQ[df.LFQ==0] <- NA
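
A minimal sketch of what these lines do, run on a made-up two-protein table. The column names (`LFQ.intensity.KO1` and so on) and the values are assumptions for illustration, not taken from the poster's file. MaxQuant writes 0 where a protein was not quantified, so the zeros are converted to NA to keep them from being treated as real intensities:

```r
# Toy stand-in for proteinGroups.txt; column names and values are hypothetical.
df.prot <- data.frame(
  Majority.protein.IDs = c("P1", "P2"),
  Peptides             = c(4L, 2L),
  LFQ.intensity.KO1    = c(1200, 0),
  LFQ.intensity.WT1    = c(0, 900)
)

# Locate the LFQ columns by name instead of by position.
quant_cols <- grep("LFQ", colnames(df.prot))
df.LFQ <- df.prot[, quant_cols]

# A zero means "not quantified"; mark it as missing.
df.LFQ[df.LFQ == 0] <- NA

print(quant_cols)
print(df.LFQ)
```

Printing `colnames(df.prot)[quant_cols]` on the real file is a quick way to confirm that only the intended columns were picked up.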

Second, you need to update the line that starts with

pep.count.table = data.frame(count = rowMins(as.matrix(df.prot[,19:24])),

Change it to:

# count unique+razor peptides used for quantification
pep_count_cols = grep("Razor...unique.peptides.", colnames(df.prot), fixed = TRUE)  # to find column number of peptide count in your input
pep.count.table = data.frame(count = rowMins(as.matrix(df.prot[,pep_count_cols])),
                             row.names = df.prot$Majority.protein.IDs)
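
To illustrate why the pattern ends in a dot, here is a small sketch with hypothetical sample names (KO1, WT1). By default `read.table(header = TRUE)` passes headers through `make.names()`, so "Razor + unique peptides KO1" becomes `Razor...unique.peptides.KO1`, while the overall summary column has no sample suffix and is not matched. The row minimum is computed with base R here as a stand-in for `matrixStats::rowMins()`:

```r
# Headers as they might appear in a MaxQuant file; sample names are hypothetical.
raw_headers <- c("Razor + unique peptides",
                 "Razor + unique peptides KO1",
                 "Razor + unique peptides WT1",
                 "Unique peptides KO1")

# Mimic the header clean-up that read.table() applies by default.
cols <- make.names(raw_headers)

# The trailing "." skips the summary column, which has no sample suffix.
pep_count_cols <- grep("Razor...unique.peptides.", cols, fixed = TRUE)

# Toy per-sample peptide counts for two proteins (rows).
counts <- matrix(c(5, 3,
                   2, 7), nrow = 2, byrow = TRUE)

# Base-R equivalent of rowMins(): smallest count per protein.
min_count <- apply(counts, 1, min)

print(cols[pep_count_cols])
print(min_count)
```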

Yafeng

HYTang69 commented 1 year ago

Thanks for your prompt reply! I did change the column positions when I first ran the script. I also tried the modifications you suggested, but I am still getting the same error. Below is what I ran. I can also provide the txt file if needed.

library(DEqMS)

df.prot = read.table("proteinGroups.txt",header=T,sep="\t",stringsAsFactors = F, comment.char = "",quote ="")

df.prot = df.prot[!df.prot$Reverse=="+",]
df.prot = df.prot[!df.prot$Contaminant=="+",]

quant_cols <- grep("LFQ",colnames(df.prot)) # to find the column number of LFQ quant value in your input
df.LFQ <- df.prot[,quant_cols]
df.LFQ[df.LFQ==0] <- NA

rownames(df.LFQ) = df.prot$Majority.protein.IDs
df.LFQ$na_count_KO = apply(df.LFQ,1,function(x) sum(is.na(x[1:3])))
df.LFQ$na_count_WT = apply(df.LFQ,1,function(x) sum(is.na(x[4:6])))

df.LFQ.filter = df.LFQ[df.LFQ$na_count_KO<2 & df.LFQ$na_count_WT<2,1:6]

library(matrixStats)

pep_count_cols = grep("Razor...unique.peptides.",fixed = T, colnames(df.prot))
pep.count.table = data.frame(count = rowMins(as.matrix(df.prot[,pep_count_cols])), row.names = df.prot$Majority.protein.IDs)

yafeng commented 1 year ago

I will have a look. Please send the input text file.

HYTang69 commented 1 year ago

proteinGroups.txt

Thank you. I have dropped the file here.

yafeng commented 1 year ago

Hi, I see what the problem is now. The input file has a different header "Potential.contaminant" for contaminant proteins. The default column "df.prot$Contaminant" doesn't exist.

You need to change the following line

df.prot = df.prot[!df.prot$Contaminant=="+",]

to

df.prot = df.prot[!df.prot$Potential.contaminant=="+",]
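
The sketch below, on a made-up three-protein table, shows why the wrong column name is hard to spot. `$` returns NULL for a column that does not exist, the comparison then yields an empty logical vector, and the filter silently returns a data frame with zero rows instead of raising an error. An empty table reaching `rowMins()` is likely what produced the "cannot be logical" message further down the script. All names and values here are hypothetical:

```r
# Toy table with the newer header name; values are made up.
df.prot <- data.frame(
  Majority.protein.IDs  = c("P1", "P2", "P3"),
  Potential.contaminant = c("", "+", ""),
  Peptides              = c(4L, 2L, 6L),
  stringsAsFactors      = FALSE
)

# The column does not exist, so `$` returns NULL and the comparison is empty.
missing_col <- df.prot$Contaminant
bad_filter  <- df.prot[!missing_col == "+", ]   # zero rows, no error raised

# Filtering on the column that is actually present keeps P1 and P3.
good_filter <- df.prot[!df.prot$Potential.contaminant == "+", ]

# A defensive check that fails early if an expected column is absent.
stopifnot("Potential.contaminant" %in% colnames(df.prot))

print(nrow(bad_filter))
print(good_filter$Majority.protein.IDs)
```

Checking `nrow(df.prot)` after each filtering step is a cheap way to catch this kind of problem on real data.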

It appears you are a beginner in R, so I want to warn you about another piece of code:

df.LFQ$na_count_KO = apply(df.LFQ,1,function(x) sum(is.na(x[1:3])))
df.LFQ$na_count_WT = apply(df.LFQ,1,function(x) sum(is.na(x[4:6])))

Please make sure your KO samples are in the first three columns and your WT samples are in columns 4 to 6 of the df.LFQ data frame. A mistake here will not raise an error, but you will not get the expected results.
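
One way to avoid depending on column order is to look up each group by name. The sketch below assumes the sample names contain "KO" or "WT", which is a guess about the naming scheme and should be adapted to the real headers. It deliberately uses a file where the WT columns come first, the case in which the positional indices 1:3 and 4:6 would count the wrong group:

```r
# Hypothetical LFQ column names; note that WT comes first here.
lfq_names <- c("LFQ.intensity.WT1", "LFQ.intensity.WT2", "LFQ.intensity.WT3",
               "LFQ.intensity.KO1", "LFQ.intensity.KO2", "LFQ.intensity.KO3")

# Find each group by name rather than assuming positions 1:3 and 4:6.
ko_cols <- grep("KO", lfq_names)
wt_cols <- grep("WT", lfq_names)

# Toy intensities for one protein: two of the three KO values are missing.
x <- c(10, 11, 12, NA, NA, 9)

na_count_KO <- sum(is.na(x[ko_cols]))
na_count_WT <- sum(is.na(x[wt_cols]))

print(c(KO = na_count_KO, WT = na_count_WT))
```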

Good luck!

Yafeng

HYTang69 commented 1 year ago

Thanks. Definitely an R beginner here. I was trying to run my dataset through DEqMS to compare with the results I obtain from my standard workflow. That error is now gone, but now min(fit3$count) returns a value of 0. I am not sure how to troubleshoot this.

yafeng commented 1 year ago

Add a pseudocount of 1:

fit3$count = fit3$count + 1
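
A short sketch of why a zero count is a problem, under the assumption (not confirmed in this thread) that the count is used on a log scale when the variance is modeled. The counts below are made up. A zero becomes -Inf after the log transform, while adding 1 to every protein keeps all values finite and preserves their ordering:

```r
# Made-up minimum peptide counts; the second protein has a zero.
count <- c(3, 0, 7)

# On a log scale the zero turns into -Inf.
log_raw <- log(count)

# Adding a pseudocount of 1 to every protein keeps all values finite.
shifted     <- count + 1
log_shifted <- log(shifted)

print(log_raw)
print(log_shifted)
```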