524D closed this issue 1 year ago
In the latest (currently unreleased) version of Comet, problem number 2 is fixed and the problematic tag mentioned in problem 1 has been removed. Output produced by that Comet version can be read successfully with readMzIdData. While problem number 1 still appears to be a bug in readMzIdData, in practice everything works correctly with the fixed Comet version and will probably never work with the unfixed one. Therefore I'm closing this issue.
Hi @524D - thank you very much for following up.
For problem 1, it might be because the underlying XML schema that readMzIdData() uses, which comes from ProteoWizard via mzR, isn't recent enough.
I would suggest you look at the PSMatch package for working with identification data. The PSM() constructor is the replacement for readMzIdData(). It won't fix problem 1, as both make use of mzR, although you might want to try the other (slower) backend, which is based on the mzID package instead. The PSMatch package should provide the existing MSnbase functionality (let me know if anything is missing) and more.
Hi,
When using readMzIdData to read mzID data produced by Comet, it fails with the following error:

After bisecting the mzID file, it appears there are two problems:
1. <cvParam cvRef="PSI-MS" accession="MS:1002500" name="peptide passes threshold" value="false" /> in the peptide scores. The value "false" is apparently not accepted by readMzIdData, though this seems valid according to the schema.
2. SpectrumIdentificationResult tags. This seems to be invalid and I will file a bug report for Comet. It would still be nice if readMzIdData could be a bit more permissive though.

For demonstration, the attached ZIP file contains a minimized mzID file with only two peptides, plus a manually edited file where the above problems are fixed.
testdata.zip
I'm using R version 4.3.1, MSnbase version 2.26.0, Bioconductor version = "3.17" on Windows 10.
Best, Rob