Open ajmaurais opened 1 year ago
Hi @ajmaurais, thanks for the bug report. I asked Jimmy about this and it looks like Comet uses a rounding mode of HALF_EVEN, which rounds to the nearest value and, on an exact tie, to the neighbor whose last digit is even. I made the change and created a new release, 3.3.1, that also includes some other optimizations (https://github.com/yeastrc/limelight-import-crux-comet-percolator/releases).
If you have a moment, can you test that on your data?
I am still getting an error:
Crux Comet/Percolator to limelight XML converter
Author: Michael Riffle <mriffle@uw.edu>
See: https://github.com/yeastrc/limelight-import-crux-comet-percolator
Version: 3.3.1
Finding pepXML files... Found 1 file(s).
Finding percolator output file... Done
Parsing percolator log file... Done
Determining versions for pipeline...
Crux version: 4.1-fa9efc63-2021-11-19
Comet version: 2021.01 rev. 0
Percolator version: 3.05.nightly-137-e806a0c5, Build Date Nov 19 2021 19:15:27
Reading comet params... Done.
Reading Percolator XML data into memory... Got 6364 peptides. Done.
Determining # of decimal places in mods in percolator peptide strings...Got: 0
Process pepXML file: comet.target.pep.xml
Reading Comet pepXML data into memory... Done.
Verifying all percolator results have comet results...Encountered error during conversion: Error: Comet results not found for peptide: MGC[57.0215]CGC[513.3063]GGCGGRC[513.3063]SGGCGGGCGGGCGG
java.lang.Exception: Error: Comet results not found for peptide: MGC[57.0215]CGC[513.3063]GGCGGRC[513.3063]SGGCGGGCGGGCGG
at org.yeastrc.limelight.xml.crux_comet_percolator.reader.CometPercolatorValidator.validateData(CometPercolatorValidator.java:40)
at org.yeastrc.limelight.xml.crux_comet_percolator.main.ConverterRunner.convertCruxCometPercolatorToLimelightXML(ConverterRunner.java:98)
at org.yeastrc.limelight.xml.crux_comet_percolator.main.MainProgram.run(MainProgram.java:91)
at picocli.CommandLine.execute(CommandLine.java:1160)
at picocli.CommandLine.access$800(CommandLine.java:141)
at picocli.CommandLine$RunLast.handle(CommandLine.java:1367)
at picocli.CommandLine$RunLast.handle(CommandLine.java:1335)
at picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:1243)
at picocli.CommandLine.parseWithHandlers(CommandLine.java:1526)
at picocli.CommandLine.run(CommandLine.java:1974)
at picocli.CommandLine.run(CommandLine.java:1904)
at org.yeastrc.limelight.xml.crux_comet_percolator.main.MainProgram.main(MainProgram.java:110)
This regex:
is never matching anything so it thinks there are 0 decimal places in the modification string. Due to the position anchors, the regex won't match peptides with multiple modifications. Even if I remove the position anchors it still doesn't match anything and I am not sure why.
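For comparison, here is a minimal sketch of counting the decimal places with an unanchored pattern. The class name, method name, and pattern are hypothetical and are not the converter's own code. One general Java detail that may be relevant: `Matcher.matches()` only succeeds when the entire input matches the pattern, while `Matcher.find()` looks for a match anywhere in the string. Whether that is what is happening in the converter is not established in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ModDecimalPlaces {
    // Hypothetical pattern (not the converter's): a bracketed mass such as [513.3063],
    // capturing only the digits after the decimal point.
    private static final Pattern MOD_MASS = Pattern.compile("\\[[+-]?\\d+\\.(\\d+)\\]");

    // Returns the number of decimal places in the first bracketed mass found,
    // or the fallback when the peptide string carries no such mass.
    static int countDecimalPlaces(String peptide, int fallback) {
        Matcher m = MOD_MASS.matcher(peptide);
        // find() scans the whole string, so several modifications are not a problem.
        return m.find() ? m.group(1).length() : fallback;
    }

    public static void main(String[] args) {
        String peptide = "MGC[57.0215]CGC[513.3063]GGCGGRC[513.3063]SGGCGGGCGGGCGG";
        System.out.println(countDecimalPlaces(peptide, 0)); // prints 4
    }
}
```

With the peptide from the error message above, this sketch reports 4 decimal places rather than 0.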
I also tried just changing the default return value of getNumberOfDecimalPlacesInPercolatorMod to 4, but then I get the same error as before due to the results in the pout.xml file being C[513.3063] and the modifications that the converter is generating being C[513.3064].
For the differential modification of 513.30635, all the masses in the pout.xml file are rounded as C[513.3063]. That wouldn't be the expected behavior of the HALF_EVEN rounding mode, correct? If Comet is using HALF_EVEN, shouldn't they be rounded as C[513.3064]?
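To make the question concrete, here is a small sketch that rounds the decimal value 513.30635 to 4 places under three of Java's `RoundingMode` values. The class and method names are made up for illustration. The mass is parsed from a string so that the tie is exact in decimal; this says nothing about how Comet or Percolator format the value internally.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class RoundingModeDemo {
    // Rounds a decimal string to 4 places using the given mode.
    static String round(String mass, RoundingMode mode) {
        // new BigDecimal(String) keeps the exact decimal value, so 513.30635 is a true tie.
        return new BigDecimal(mass).setScale(4, mode).toPlainString();
    }

    public static void main(String[] args) {
        RoundingMode[] modes = {
            RoundingMode.HALF_UP, RoundingMode.HALF_EVEN, RoundingMode.HALF_DOWN
        };
        for (RoundingMode mode : modes) {
            System.out.println(mode + " -> " + round("513.30635", mode));
        }
        // prints:
        // HALF_UP -> 513.3064
        // HALF_EVEN -> 513.3064
        // HALF_DOWN -> 513.3063
    }
}
```

Because the digit before the 5 is an odd 3, HALF_EVEN rounds the tie up to 4, so of these three modes only HALF_DOWN reproduces the C[513.3063] seen in pout.xml.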
Hmm. OK, I'll generate some test data on my end looking for that mod mass and see if I can duplicate. Thanks again.
Ok, I can also share my data with you if that helps.
That would be great, can you stick it in a google drive? I can't access GSIT systems.
I am doing a comet search with a differential modification on cysteine with a mass of 513.30635.
I am getting an error when I try to convert the crux output to a limelight XML:
I believe what is causing the error is that the modification masses in peptide strings in the percolator output file are rounded to 4 decimal places, with a trailing 5 rounded down. When the limelight converter reads the pep.xml files, it has to calculate how the modifications should be encoded in a string in order to cross-reference the percolator and pep.xml results. The limelight converter rounds a trailing 5 up. So the same peptide in the percolator results would be encoded as C[513.3063], whereas the peptides in the .pep.xml results would be encoded as C[513.3064].
Ultimately I was able to get the conversion to work by changing the RoundingMode in the getReportedPeptideStringForSequenceAndMods function to HALF_DOWN:
https://github.com/yeastrc/limelight-import-crux-comet-percolator/blob/74d7e154090182cf5d762499a29b93bd7d71926c/src/main/java/org/yeastrc/limelight/xml/crux_comet_percolator/utils/ReportedPeptideUtils.java#L41
But I don't know whether it is safe to assume that the percolator results will always be rounded down.
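The mismatch described above can be sketched as follows. This is an illustration only: `PeptideStringSketch`, `encode`, and the short peptide `ACDK` are hypothetical and are not taken from the converter's source. It assumes the cross-reference is an exact string comparison, so a difference in the last digit of one mass is enough to make the lookup fail.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Map;

final class PeptideStringSketch {
    // Builds a string such as AC[513.3064]DK from a sequence and a map of
    // 1-based residue position -> modification mass (as a decimal string).
    static String encode(String sequence, Map<Integer, String> mods,
                         int places, RoundingMode mode) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < sequence.length(); i++) {
            sb.append(sequence.charAt(i));
            String mass = mods.get(i + 1);
            if (mass != null) {
                String rounded = new BigDecimal(mass).setScale(places, mode).toPlainString();
                sb.append('[').append(rounded).append(']');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up peptide with the 513.30635 modification on the cysteine at position 2.
        Map<Integer, String> mods = Map.of(2, "513.30635");
        String roundedUp = encode("ACDK", mods, 4, RoundingMode.HALF_UP);
        String roundedDown = encode("ACDK", mods, 4, RoundingMode.HALF_DOWN);
        System.out.println(roundedUp);                     // prints AC[513.3064]DK
        System.out.println(roundedDown);                   // prints AC[513.3063]DK
        System.out.println(roundedUp.equals(roundedDown)); // prints false
    }
}
```

The two strings describe the same peptide but differ in one digit, which is consistent with the "Comet results not found for peptide" error when the strings are used as lookup keys.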