mwalzer / psi-pi

Automatically exported from code.google.com/p/psi-pi

PeptideEvidenceList/EnzymeRefs #60

GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
It was briefly discussed on the call that the current 0 or more EnzymeRefs does 
not make it clear whether every PeptideEvidence of a PeptideEvidenceList is due 
to all of the enzymes or if it can be due to just one. Where do semi-specific 
peptides go? Where do non-specific peptides go?

For multiple independent enzymes, should each PeptideEvidenceList have only one 
EnzymeRef?

For multiple non-independent enzymes, each PeptideEvidence may come from one of 
three enzyme combinations: enzyme 1, enzyme 2, or enzyme 1+2. With more enzymes 
the number of combinations increases.
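
To make the growth concrete, here is a minimal Python sketch that counts the combinations. It assumes a peptide has at most two cleaved termini, so at most two enzymes can contribute to it; the function name `enzyme_combinations` and the enzyme labels are illustrative only and do not come from the schema.

```python
from itertools import combinations


def enzyme_combinations(enzymes):
    """List the enzyme sets a single peptide could be attributed to.

    Assumption: a peptide has two termini, so at most two enzymes
    can be responsible for it (one enzyme alone, or a pair).
    """
    singles = [(enzyme,) for enzyme in enzymes]
    pairs = list(combinations(enzymes, 2))
    return singles + pairs


two = enzyme_combinations(["enzyme 1", "enzyme 2"])
three = enzyme_combinations(["enzyme 1", "enzyme 2", "enzyme 3"])
print(len(two), len(three))
```

Under that assumption the count is n + n(n-1)/2 for n enzymes.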

Actually, since each PeptideEvidence can only come from a maximum of two 
enzymes, EnzymeRef should really be 1 or 2, not unbounded. I'm in favor of 
EnzymeRef being specific to a terminus. Perhaps two different elements: 
NTerminusEnzymeRef and CTerminusEnzymeRef for each PeptideEvidenceList.

Original issue reported on code.google.com by matt.cha...@gmail.com on 7 Apr 2011 at 4:12

GoogleCodeExporter commented 9 years ago
We can also move EnzymeRef to be attributes of PeptideEvidenceList for a 
slightly cleaner representation IMO. And then I make the references specific to 
each terminus to dirty it up again. ;)

One list of fully tryptic peptides:
<PeptideEvidenceList nTerminusEnzyme_ref="EZ_TRYP" cTerminusEnzyme_ref="EZ_TRYP">

Two lists of semitryptic peptides:
<PeptideEvidenceList nTerminusEnzyme_ref="EZ_TRYP">
<PeptideEvidenceList cTerminusEnzyme_ref="EZ_TRYP">

One list of nontryptic peptides:
<PeptideEvidenceList>

For two non-independent enzymes there are two fully specific lists:
<PeptideEvidenceList nTerminusEnzyme_ref="EZ_LYSC" cTerminusEnzyme_ref="EZ_ARGC">
<PeptideEvidenceList nTerminusEnzyme_ref="EZ_ARGC" cTerminusEnzyme_ref="EZ_LYSC">

There are four semi-specific lists:
<PeptideEvidenceList nTerminusEnzyme_ref="EZ_LYSC">
<PeptideEvidenceList nTerminusEnzyme_ref="EZ_ARGC">
<PeptideEvidenceList cTerminusEnzyme_ref="EZ_LYSC">
<PeptideEvidenceList cTerminusEnzyme_ref="EZ_ARGC">

But again just a single non-specific list.

Adding a third enzyme makes it really bad but I hope the pattern is clear. This 
certainly makes it unambiguous what each list represents!
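
For the single-enzyme case, the way a peptide would be assigned to one of these lists can be sketched in Python. This is only an illustration under stated assumptions: the cleavage rule is a simplified trypsin rule (cleave after K or R unless the next residue is P), protein termini are treated as specific, and the helper names `is_cleavage_site` and `list_attributes` and the example protein are hypothetical.

```python
def is_cleavage_site(protein, pos, cleave_after="KR", blocked_by="P"):
    """True if the bond between protein[pos-1] and protein[pos] is a cleavage site."""
    if pos == 0 or pos == len(protein):
        return True  # assumption: protein termini count as specific
    return protein[pos - 1] in cleave_after and protein[pos] not in blocked_by


def list_attributes(protein, start, end, enzyme_ref="EZ_TRYP"):
    """Attributes of the PeptideEvidenceList that peptide protein[start:end] falls into."""
    attrs = {}
    if is_cleavage_site(protein, start):
        attrs["nTerminusEnzyme_ref"] = enzyme_ref
    if is_cleavage_site(protein, end):
        attrs["cTerminusEnzyme_ref"] = enzyme_ref
    return attrs


protein = "MKAAARPGGKTTT"  # made-up sequence for illustration
print(list_attributes(protein, 2, 10))  # AAARPGGK: both termini specific
print(list_attributes(protein, 3, 10))  # AARPGGK: only the C-terminus specific
print(list_attributes(protein, 3, 8))   # AARPG: neither terminus specific
```

An empty attribute dictionary corresponds to the single non-specific list.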

Original comment by matt.cha...@gmail.com on 7 Apr 2011 at 4:24

GoogleCodeExporter commented 9 years ago
Heidelberg: 
- This information does not seem necessary to repeat the search, to report the 
results, or to judge their quality
- Furthermore, it seems to be a rather rare case and can be solved when it comes 
up

Original comment by eisena...@googlemail.com on 12 Apr 2011 at 12:22

GoogleCodeExporter commented 9 years ago
OK so that seems to shoot down my suggestions without answering any of my 
questions. If we're not going to be unambiguous with the enzyme information, 
why include it at all? We could just go back to having 0 or more 
PeptideEvidence elements instead of PeptideEvidenceLists. And of course remove 
missedCleavages again.

Original comment by matt.cha...@gmail.com on 12 Apr 2011 at 1:24

GoogleCodeExporter commented 9 years ago
Agreement TeleCon 21.4.2011: It was of course not intended to shoot down your 
suggestions without answering questions or giving arguments; sorry for that. The 
TeleCon again discussed that grouping the <PeptideEvidence> elements makes sense 
because it gives missedCleavages and the other attributes an appropriate point 
of reference. So <EnzymeRef> will remain an element, as it is now. A cvParam 
sub-element will be added to <EnzymeRef> to describe terminal specificity.

Action points: 1) I will change the schema accordingly; 2) I will send David 
and Juan-Antonio two new CV terms: "Enzyme specificity N-term" and "Enzyme 
specificity C-term".

Original comment by eisena...@googlemail.com on 21 Apr 2011 at 4:50

GoogleCodeExporter commented 9 years ago
If I understand correctly, the resulting syntax would look like this for a 
semi-tryptic search:
<PeptideEvidenceList>
  <PeptideEvidence ...>
  ...
  <EnzymeRef ref="EZ_TRYP">
     <cvParam name="N-terminal enzyme specificity"/>
     <cvParam name="C-terminal enzyme specificity"/>
  </EnzymeRef>
</PeptideEvidenceList>
<PeptideEvidenceList>
  <PeptideEvidence ...>
  ...
  <EnzymeRef ref="EZ_TRYP">
     <cvParam name="N-terminal enzyme specificity"/>
  </EnzymeRef>
</PeptideEvidenceList>
<PeptideEvidenceList>
  <PeptideEvidence ...>
  ...
  <EnzymeRef ref="EZ_TRYP">
     <cvParam name="C-terminal enzyme specificity"/>
  </EnzymeRef>
</PeptideEvidenceList>
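
The same three lists could be generated programmatically. The following Python sketch uses the standard-library `xml.etree.ElementTree`; the cvParam names are the ones guessed in the listing above (not the final CV terms), the function name `peptide_evidence_list` is hypothetical, and leaving out EnzymeRef when neither terminus is specific is only one possible reading of the still-open non-specific case.

```python
import xml.etree.ElementTree as ET

# Names as used in the listing above; the final CV terms may differ.
N_TERM = "N-terminal enzyme specificity"
C_TERM = "C-terminal enzyme specificity"


def peptide_evidence_list(enzyme_ref, n_specific, c_specific):
    """Build a PeptideEvidenceList element (PeptideEvidence children omitted)."""
    evidence_list = ET.Element("PeptideEvidenceList")
    if n_specific or c_specific:
        ref = ET.SubElement(evidence_list, "EnzymeRef", {"ref": enzyme_ref})
        if n_specific:
            ET.SubElement(ref, "cvParam", {"name": N_TERM})
        if c_specific:
            ET.SubElement(ref, "cvParam", {"name": C_TERM})
    # Assumption: a non-specific list simply carries no EnzymeRef.
    return evidence_list


for n_specific, c_specific in [(True, True), (True, False), (False, True)]:
    element = peptide_evidence_list("EZ_TRYP", n_specific, c_specific)
    print(ET.tostring(element, encoding="unicode"))
```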

What about the non-specific case? No EnzymeRef?

Original comment by matt.cha...@gmail.com on 21 Apr 2011 at 5:00