Closed mattiat closed 13 years ago
Hopefully connected with https://github.com/percolator/percolator/issues#issue/13
So I would prefer that you keep scanNumber="[scan #]". So this is just a problem in sqt2pin, and percolator can handle multiple fragSpectrumScan elements with the same scanNumber. Right?
On Mon, Mar 14, 2011 at 4:31 PM, mattiat reply@reply.github.com wrote:
Current psm grouping (by scan number) seems to cause information from different files to be retained in memory longer than necessary. Consider instead splitting the fragSpectrumScan: scanNumber="[scan #]_[Filename]".
I noticed that the schema groups psms first by scan number. So within the fragSpectrumScan tag, there are multiple PSMs, all for spectra with the same scan number but with different charge states and from different files. For example (I've removed some of the attributes):

Seems like this would require that all the files be read in and stored before the pin.xml can be written out. Perhaps percolator requires that the psms be grouped in this way, but if not, maybe it's possible to change the schema slightly so that the pin.xml file is written as the .sqt files are read, so that there's no need to store much in memory.

... ... ... ... ... ...

TODO: _ discuss with Lukas
Yes, Percolator has no problem handling multiple fragSpectrumScan elements with the same scanNumber. For each psm in a pin file, the fragSpectrumScan's scanNumber is read and stored in a PSMDescription object. It is not used as an id and CAN be duplicated. To make sure this is the case, I compared the outputs of percolator on two almost identical pin files, the only difference being that one fragSpectrumScan had been "broken down" (i.e. its psms had been regrouped into multiple fragSpectrumScans with identical attributes). The results were identical.
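The point above (scanNumber is stored as data, not used as a key) can be illustrated with a small sketch. The `PsmRecord` and `ScanGroup` types below are hypothetical stand-ins made up for this example; they are not Percolator's PSMDescription or its pin parser.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a stored PSM; NOT Percolator's PSMDescription.
struct PsmRecord {
    std::string id;  // e.g. "FileName_35_2_1"
    int scanNumber;  // plain data copied from the enclosing fragSpectrumScan
};

// Hypothetical stand-in for one fragSpectrumScan element.
struct ScanGroup {
    int scanNumber;
    std::vector<std::string> psmIds;
};

// Flatten the groups into a list of records. The scan number is only a field,
// never a lookup key, so two groups with the same scanNumber do not collide.
std::vector<PsmRecord> flatten(const std::vector<ScanGroup>& groups) {
    std::vector<PsmRecord> out;
    for (const ScanGroup& g : groups) {
        for (const std::string& id : g.psmIds) {
            out.push_back(PsmRecord{id, g.scanNumber});
        }
    }
    return out;
}

// Element-wise comparison of two record lists.
bool sameRecords(const std::vector<PsmRecord>& a, const std::vector<PsmRecord>& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i].id != b[i].id || a[i].scanNumber != b[i].scanNumber) return false;
    }
    return true;
}

// Usage: one merged group and the same PSMs split over two groups
// flatten to the same records.
void checkSplitIsEquivalent() {
    ScanGroup merged;
    merged.scanNumber = 35;
    merged.psmIds.push_back("FileName_35_2_1");
    merged.psmIds.push_back("FileName_35_4_1");
    merged.psmIds.push_back("DifferentFileName_35_2_1");

    ScanGroup first;
    first.scanNumber = 35;
    first.psmIds.push_back("FileName_35_2_1");
    first.psmIds.push_back("FileName_35_4_1");
    ScanGroup second;
    second.scanNumber = 35;
    second.psmIds.push_back("DifferentFileName_35_2_1");

    std::vector<ScanGroup> mergedDoc(1, merged);
    std::vector<ScanGroup> splitDoc;
    splitDoc.push_back(first);
    splitDoc.push_back(second);
    assert(sameRecords(flatten(mergedDoc), flatten(splitDoc)));
}
```

This only mirrors the reasoning; the actual check described above was done by running percolator on two pin files.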
Current behavior of sqt2pin:
<fragSpectrumScan experimentalMassToCharge="912.6508" scanNumber="35">
  <peptideSpectrumMatch id="FileName_35_2_1"></...
  <peptideSpectrumMatch id="FileName_35_4_1"></...
  <peptideSpectrumMatch id="DifferentFileName_35_2_1"></...
</fragSpectrumScan>
Modify to:
<fragSpectrumScan experimentalMassToCharge="912.6508" scanNumber="35">
  <peptideSpectrumMatch id="FileName_35_2_1"></...
  <peptideSpectrumMatch id="FileName_35_4_1"></...
</fragSpectrumScan>
<fragSpectrumScan experimentalMassToCharge="912.6508" scanNumber="35">
  <peptideSpectrumMatch id="DifferentFileName_35_2_1"></...
</fragSpectrumScan>
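As a rough sketch of the regrouping shown above, the following splits one merged group into one group per source file while keeping the same scanNumber. It assumes that psm ids follow the layout `<file>_<scan>_<charge>_<rank>` seen in the example; the `ScanGroup` type and both function names are hypothetical and not taken from sqt2pin.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one fragSpectrumScan element.
struct ScanGroup {
    int scanNumber;
    std::vector<std::string> psmIds;
};

// ASSUMED id layout: "<file>_<scan>_<charge>_<rank>". Returns the <file> part
// by cutting off the last three underscore-separated fields, or an empty
// string if the id does not have that shape.
std::string fileOfPsmId(const std::string& id) {
    std::string::size_type end = id.size();
    for (int i = 0; i < 3; ++i) {
        if (end == 0) return std::string();
        end = id.rfind('_', end - 1);
        if (end == std::string::npos) return std::string();
    }
    return id.substr(0, end);
}

// Split a merged group into one group per file, in first-seen file order.
// Every resulting group keeps the original scanNumber.
std::vector<ScanGroup> splitByFile(const ScanGroup& merged) {
    std::vector<ScanGroup> out;
    std::vector<std::string> files;  // files[k] is the file of out[k]
    for (const std::string& id : merged.psmIds) {
        const std::string file = fileOfPsmId(id);
        std::size_t k = 0;
        while (k < files.size() && files[k] != file) ++k;
        if (k == files.size()) {
            files.push_back(file);
            ScanGroup g;
            g.scanNumber = merged.scanNumber;
            out.push_back(g);
        }
        assert(k < out.size());  // files and out always grow together
        out[k].psmIds.push_back(id);
    }
    return out;
}
```

A file name that itself contains underscores is still handled, because only the last three fields are stripped.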
So all three psms have the same mass. That must be a bug...
Cheers
-Lukas

On Apr 13, 2011 11:36 AM, "mattiat" <reply@reply.github.com> wrote:
Current behavior of sqt2pin:
... ... ... ... ... ...
Modify to:
... ... ... ... ... ...
Reply to this email directly or view it on GitHub: https://github.com/percolator/percolator/issues/22#comment_994483
SqtReader::translateSqtFileToXML() takes a FragSpectrumScanDatabase as a parameter, and all psms are stored in it.
FIX: pass a vector
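One possible reading of this fix, sketched under assumptions: buffer the psms of a single .sqt file in a vector, write them out grouped by scan number once that file has been read, and empty the vector before the next file. The `Psm` struct and `flushFile` function are hypothetical, the XML is reduced to two attributes, and nothing here reflects the real signature of SqtReader::translateSqtFileToXML().

```cpp
#include <cassert>
#include <map>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical minimal PSM record (a real pin.xml entry has many more fields).
struct Psm {
    int scanNumber;
    std::string id;
};

// Write the buffered PSMs of ONE input file, grouped by scan number,
// then empty the buffer so nothing is carried over to the next file.
void flushFile(std::ostream& os, std::vector<Psm>& buffer) {
    std::map<int, std::vector<std::string> > byScan;
    for (const Psm& p : buffer) {
        byScan[p.scanNumber].push_back(p.id);
    }
    for (const auto& entry : byScan) {
        os << "<fragSpectrumScan scanNumber=\"" << entry.first << "\">\n";
        for (const std::string& id : entry.second) {
            os << "  <peptideSpectrumMatch id=\"" << id << "\"/>\n";
        }
        os << "</fragSpectrumScan>\n";
    }
    buffer.clear();
}

// Usage: two files that both contain scan 35 yield two separate
// fragSpectrumScan elements with the same scanNumber.
std::string demoTwoFiles() {
    std::ostringstream os;
    std::vector<Psm> buffer;
    buffer.push_back(Psm{35, "FileName_35_2_1"});
    buffer.push_back(Psm{35, "FileName_35_4_1"});
    flushFile(os, buffer);  // end of first file
    assert(buffer.empty());
    buffer.push_back(Psm{35, "DifferentFileName_35_2_1"});
    flushFile(os, buffer);  // end of second file
    return os.str();
}
```

With this shape, peak memory is bounded by the largest single input file rather than by all files together.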
New psm grouping does not solve Genn's problem: https://github.com/percolator/percolator/issues#issue/13.
Mattia,
do not build the path /scratch/ into any of your code. It's a path that is not supported by any of the major platforms. It's local to cbr and maybe pdc. Use /tmp instead.
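A minimal sketch of resolving the temporary directory at run time instead of hard-coding it. It falls back to /tmp as suggested above; honouring the TMPDIR environment variable first is an assumption added here (a common POSIX convention), and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Directory for temporary files: $TMPDIR if set and non-empty (assumption,
// POSIX convention), otherwise /tmp. Trailing slashes are stripped.
std::string tempDir() {
    const char* env = std::getenv("TMPDIR");
    std::string dir = (env != 0 && *env != '\0') ? env : "/tmp";
    while (dir.size() > 1 && dir[dir.size() - 1] == '/') {
        dir.erase(dir.size() - 1);
    }
    return dir;
}

// Build the full path of a temporary file with the given name.
std::string tempFilePath(const std::string& fileName) {
    assert(!fileName.empty());
    return tempDir() + "/" + fileName;
}
```

For example, `tempFilePath("percolator_pin.xml")` gives `/tmp/percolator_pin.xml` when TMPDIR is unset.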
Cheers
Lukas Käll http://kaell.org Center for Biomembrane Research Dep. of Biochemistry and Biophysics Stockholms Universitet SE-10691 Stockholm, Sweden Tel: +46 8 162947 Fax: +46 8 153679
On Mon, May 23, 2011 at 13:44, mattiat reply@reply.github.com wrote:
New psm grouping does not solve Genn's problem: https://github.com/percolator/percolator/issues#issue/13.