compomics / searchgui

Highly adaptable common interface for proteomics search and de novo engines
http://compomics.github.io/projects/searchgui.html

Number of modifications that could be submitted #125

Closed: kapilaGIT closed 7 years ago

kapilaGIT commented 7 years ago

Hi

I know that a warning appears as soon as I try to include more than (I think) 6 modifications in the project setup in PeptideShaker. I think the main problem is computation time, as the search could take a very long time. If I really want to look at user-defined modifications on all 20 amino acids, would this be possible at all, and how long do you think it would take?

I forgot to mention my operating system. I run SearchGUI on VMs with 32 CPUs and 128 GB RAM, running Windows Server 2012.

Best regards

Kapila

mvaudel commented 7 years ago

Hi Kapila,

Using many PTMs will dramatically increase the search time and also increase the prevalence of false positives. That said, it is up to you to use many modifications. How long it will take depends on your dataset and database, so I would recommend starting simple and slowly increasing the complexity of the search. I cannot guarantee that it makes sense, though!
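To make the growth in search space concrete, here is a minimal sketch that counts how many distinct forms a single peptide can take when one variable modification is allowed on a chosen set of residues. The function name `count_modified_forms`, the example peptide, and the optional `max_mods` cap are illustrative assumptions, not SearchGUI parameters or the way any particular search engine enumerates candidates.

```python
from math import comb
from typing import Iterable, Optional


def count_modified_forms(peptide: str,
                         target_residues: Iterable[str],
                         max_mods: Optional[int] = None) -> int:
    """Count the forms of a peptide when one variable modification
    may be present or absent on every residue in target_residues.

    max_mods is a hypothetical cap on modifications per peptide;
    None means no cap.
    """
    targets = set(target_residues)
    # Number of positions in the peptide that can carry the modification.
    n_sites = sum(1 for aa in peptide if aa in targets)
    limit = n_sites if max_mods is None else min(max_mods, n_sites)
    # Choose k of the n_sites positions to be modified, for k = 0..limit.
    return sum(comb(n_sites, k) for k in range(limit + 1))


peptide = "SAMPLEPEPTIDEK"  # hypothetical 14-residue example
print(count_modified_forms(peptide, "STY"))                      # 2 sites
print(count_modified_forms(peptide, set(peptide)))               # every residue
print(count_modified_forms(peptide, set(peptide), max_mods=3))   # capped
```

With only serine, threonine, and tyrosine as targets, this peptide has 2 modifiable sites and 4 forms. Allowing the modification on every residue gives 14 sites and 2^14 = 16384 forms, or 470 forms if at most 3 modifications per peptide are allowed. Each extra form is one more candidate to score against every spectrum, which is where both the longer run time and the extra opportunities for random matches come from.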

Hope this helps,

Marc

kapilaGIT commented 7 years ago

Hi Marc,

For example, say that I want to ask whether all residues could undergo a specific modification. As you said, searching for a number of different modifications could "increase the prevalence of false positives", but would that also be the case in this example?

I have already performed the searches using an iterative approach: I started with 7 residues, then dropped the one with the fewest assignments and brought in a new residue to keep the total at 7. My problem is how to know the optimal number to use. Do you have an educated guess, based on your experience, of what that number should be?

Best regards

Kapila

kapilaGIT commented 7 years ago

Hi Marc,

Did you see my last reply? What do you think about my questions on the number of residues to include when searching for the same modification? One reason for asking is that the number of assignments on a given residue differs between combinations of residues. I assumed that by looking at the change in ions and the site localisation scores one could get a pretty good idea of the correct assignments, but that does not seem to be the case.

Best regards

Kapila

mvaudel commented 7 years ago

Dear Kapila,

Sorry for not answering earlier on this one. If you have a modification that can be on several amino acids, it can become difficult to localize it. You are correct, in many cases there will not be ions allowing the discrimination of the sites and the score will not be informative either... Then there is not much one can do.
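The point about missing discriminating ions can be illustrated with a simplified model. It assumes a single modification on the peptide, 1-based residue positions, and that a fragment ion from the backbone cleavage after residue `c` (for example b_c) separates two candidate sites `p < q` only when `p <= c < q`. The function `ambiguity_groups` and the example values are hypothetical, and this is not how PeptideShaker computes its localization scores.

```python
from typing import Iterable, List


def ambiguity_groups(candidate_sites: Iterable[int],
                     observed_cleavages: Iterable[int]) -> List[List[int]]:
    """Group candidate sites that no observed cleavage can tell apart.

    candidate_sites: 1-based positions where the modification could sit.
    observed_cleavages: indices c such that a fragment ion covering
        residues 1..c (or its complement) was observed.
    """
    cleavages = set(observed_cleavages)
    groups: List[List[int]] = []
    for site in sorted(candidate_sites):
        previous = groups[-1][-1] if groups else None
        # A cleavage c with previous <= c < site separates the two sites.
        separated = previous is None or any(
            previous <= c < site for c in cleavages
        )
        if separated:
            groups.append([site])
        else:
            groups[-1].append(site)
    return groups


# Hypothetical peptide ASTSYK with candidate sites S2, T3, S4, Y5,
# and fragment ions observed only for the cleavages after residues 1 and 4.
print(ambiguity_groups([2, 3, 4, 5], [1, 4]))
```

In this example positions 2, 3, and 4 end up in one group because no observed ion falls between them, while position 5 is separated by the cleavage after residue 4. The more residues are allowed to carry the modification, the more candidate sites fall into the same group, and no score can then pick one site over another.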

Generally we look at all possible yet realistic modification sites. For example with phosphorylation, unless you used a dedicated protocol there is no point in looking for lysine, arginine, and histidine phosphorylation because they would not survive most proteomic protocols. So although they are possible we focus on phosphorylation of serine and threonine, and possibly tyrosine.

Hope it makes sense. If you provide us with more specific examples we will hopefully be able to help more, but I am afraid you are at the border of what mass spectrometry can offer :)

Best regards,

Marc