Open GoogleCodeExporter opened 9 years ago
One extra note - I would like to discuss this briefly at the PSI 2013 meeting.
We need to decide:
- Whether this is allowed at all (I vote yes, so long as protein inference does
not need to be repeated)
- Whether combined results should be placed in one SIList or in multiple
SILists (not sure I have a preference on this one at the moment - we need to
check the specification carefully for what an SIList is supposed to represent)
Original comment by andrewro...@googlemail.com
on 5 Mar 2013 at 2:46
Discussed at PSI2013
Summary:
Pre-fractionation results can be combined in mzIdentML. Different use cases,
e.g. 10 fractions producing 10 MGF files:
- Search engine combines MGF files before searching together / protein
inference together --> Then encode as one mzid file, with one SIList, one
ProteinList
- If search engine does 10 searches, produces 10 mzid files. Post-processing
software takes these and combines, then does protein inference. Strong
preference for final output file to combine the results into one single SIList,
one protein list. Alternative encoding to produce 10 SILists is a pain for
reading/viewing software.
ACTION: Update specification document with this implementation recommendation
(it is a recommendation only, since it obviously cannot be enforced).
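To illustrate the second use case, here is a minimal sketch of how
post-processing software might collapse per-fraction results into one list
before writing a single SIList. The Psm class, its field names and the
"higher score is better" ordering are assumptions made for illustration, not
part of mzIdentML or of any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Psm:
    # All field names here are hypothetical, chosen for this sketch only.
    spectra_data: str  # source peak list for the fraction, e.g. "fraction_01.mgf"
    spectrum_id: str   # spectrum identifier within that file, e.g. "index=0"
    peptide: str
    score: float       # assumption: higher score means a better match


def combine_fractions(per_fraction):
    """Flatten per-fraction PSM lists into one list, best score first."""
    combined = [psm for psms in per_fraction for psm in psms]
    combined.sort(key=lambda p: p.score, reverse=True)
    return combined


fraction_1 = [Psm("fraction_01.mgf", "index=0", "PEPTIDEK", 42.0)]
fraction_2 = [Psm("fraction_02.mgf", "index=0", "SAMPLER", 57.5)]
single_list = combine_fractions([fraction_1, fraction_2])
```

Keeping the source file alongside the spectrum ID matters here, because
"index=0" exists in every fraction and only the pair identifies a spectrum.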
Further note on multiple search engine results. Preference is also to report
final results in one single list, showing combined scores. Reading software
does not easily deal with multiple lists - would have to do own combination and
re-ranking of results.
General principle - each spectrum SHOULD only be reported once per file.
ACTION: Update to spec doc with this recommendation.
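As a rough sketch of how a validator or reader might check this principle,
the snippet below looks for spectra that are referenced from more than one
SpectrumIdentificationList. It assumes a spectrum is identified by the pair
(spectraData_ref, spectrumID) on each SpectrumIdentificationResult; the
function name and the example document are made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def local_name(tag):
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def spectra_in_multiple_lists(xml_text):
    """Map (spectraData_ref, spectrumID) to list ids, where reported > once."""
    root = ET.fromstring(xml_text)
    seen = defaultdict(list)
    for elem in root.iter():
        if local_name(elem.tag) != "SpectrumIdentificationList":
            continue
        for child in elem:
            if local_name(child.tag) == "SpectrumIdentificationResult":
                key = (child.get("spectraData_ref"), child.get("spectrumID"))
                seen[key].append(elem.get("id"))
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


# Hand-crafted, incomplete example document (not a valid mzIdentML file).
EXAMPLE = """<MzIdentML xmlns="http://psidev.info/psi/pi/mzIdentML/1.1">
  <DataCollection><AnalysisData>
    <SpectrumIdentificationList id="SIL_1">
      <SpectrumIdentificationResult id="SIR_1" spectraData_ref="SD_1" spectrumID="index=0"/>
      <SpectrumIdentificationResult id="SIR_2" spectraData_ref="SD_1" spectrumID="index=1"/>
    </SpectrumIdentificationList>
    <SpectrumIdentificationList id="SIL_2">
      <SpectrumIdentificationResult id="SIR_3" spectraData_ref="SD_1" spectrumID="index=0"/>
    </SpectrumIdentificationList>
  </AnalysisData></DataCollection>
</MzIdentML>"""

duplicates = spectra_in_multiple_lists(EXAMPLE)
```

Since this is a SHOULD rather than a MUST, a check like this would
presumably produce a warning rather than reject the file.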
Also discussed CID/ETD - these SHOULD be reported in separate SILists, since
protocol and everything else about search is different.
ACTION: Update to spec doc with this recommendation.
Original comment by andrewro...@googlemail.com
on 17 Apr 2013 at 1:40
File attached showing the recommended encoding for fractions (hand-crafted and
incomplete)
Original comment by andrewro...@googlemail.com
on 16 May 2013 at 2:39
2nd attempt to attach the file
Original comment by andrewro...@googlemail.com
on 16 May 2013 at 2:40
Here are the minutes from last week's call repeated here:
****
- How to represent fractions in mzIdentML?
- Would it be possible to represent the multiple runs and the collapsed view in
the same file? In principle, the preferred way would be to allow only the
combined results (from the different fractions/runs) but not the individual
results in the same file.
Eric commented that if only the collapsed view were allowed, it would not be
possible to convert in both directions between mzIdentML and pep.xml files.
Possible solutions to allow this:
- Use CV param to say if one list of PSMs is the final result or not
(true/false)
- Use CV params to name specific fractions.
No solution yet: it needs to be discussed further.
*****
Further to this, I think we should make it easy to convert back and forth with
pepXML, where separate fractions are maintained in separate lists (in the same
file I think?). As such, we need to have a general mechanism for telling a data
consumer what to do when they see multiple lists. The assumption is that these
are all "final" results (since that is a key principle for mzIdentML).
For multiple fractions this is okay and supports either separate lists or
single lists (depending on how the search was done). The difficulty comes in
relation to the multiple search engine discussion: should reading software be
expected to determine how to re-rank results if it sees different SIRs (in
different lists) referencing the same spectrum?
Original comment by andrewro...@googlemail.com
on 23 May 2013 at 2:59
Original issue reported on code.google.com by
andrewro...@googlemail.com
on 23 Jan 2013 at 3:32