Open colin-combe opened 6 years ago
I think it is a good question and we need to sort it out. If you happen to have a file that uses the Mascot query number (MS:1001528) format in a CV param, please send it to us; it would help with debugging. Thanks
I don't have an example, and I can't find one either - I just noticed the seeming inconsistency between the definition of the CV term and the code.
However, it's maybe not such a big problem - I notice the mzIdentML 1.2.0 schema (para 5.1.2) does not list MS:1001528 as a legal way of referencing a spectrum identification.
There is an example mzid file using MS:1001528 at ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2014/01/PXD000198/1007.mzid
It seems this project is a partial project, and the spectra file referenced inside 1007.mzid (file:///DATA.TXT) is missing from the project. We might have to find a better example.
MS:1001528 is not even allowed in version 1.1.0. I think we might need to remove that code in the next release to avoid any confusion.
Hi - we have been looking at different ways of parsing mzIdentML files. PRIDE's ms-data-core-api looks to be the most complete solution; for example, other libraries do not deal with the different formats used for spectrum ids.
I have a question regarding the formats used for spectrum ids and the code at: https://github.com/PRIDE-Utilities/ms-data-core-api/blob/4b5f9f8d8a87c03b37a9652492a95aec029c1ca9/src/main/java/uk/ac/ebi/pride/utilities/data/utils/MzIdentMLUtils.java#L55-L82
As I read it, when the fileIdFormat is Constants.SpecIdFormat.MASCOT_QUERY_NUM or Constants.SpecIdFormat.MULTI_PEAK_LIST_NATIVE_ID, one is added to the spectrum id. I took this to mean that these formats use zero-based indexes, whereas the norm is to use one-based indexes.
This is the case for the multiple peak list nativeID format (MS:1000774) which says:
However, for the Mascot query number (MS:1001528) it says:
So, finally getting to my question, why is one added to the spectrum id for Mascot query number format when the corresponding CV term says it is already one-based?
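To make the question concrete, here is a minimal sketch of the behaviour as I read it. It is not the actual ms-data-core-api code: the class, the nested enum and both method names are hypothetical stand-ins, and it only models the two formats discussed here.

```java
// Hypothetical sketch, not the real MzIdentMLUtils code.
final class SpectrumIdSketch {

    // Stand-in for Constants.SpecIdFormat; only the formats discussed here.
    enum SpecIdFormat { MASCOT_QUERY_NUM, MULTI_PEAK_LIST_NATIVE_ID, OTHER }

    // Behaviour as I read the linked code: one is added for both formats.
    static int asCurrentlyCoded(SpecIdFormat format, int rawId) {
        switch (format) {
            case MASCOT_QUERY_NUM:
            case MULTI_PEAK_LIST_NATIVE_ID:
                return rawId + 1;
            default:
                return rawId;
        }
    }

    // What the CV definitions seem to imply: only the zero-based
    // multiple peak list nativeID format needs shifting.
    static int asCvTermsSuggest(SpecIdFormat format, int rawId) {
        return format == SpecIdFormat.MULTI_PEAK_LIST_NATIVE_ID ? rawId + 1 : rawId;
    }

    public static void main(String[] args) {
        // First Mascot query (query=1, already one-based per the CV term).
        System.out.println(asCurrentlyCoded(SpecIdFormat.MASCOT_QUERY_NUM, 1));
        System.out.println(asCvTermsSuggest(SpecIdFormat.MASCOT_QUERY_NUM, 1));
    }
}
```

If the CV term is right, the first variant would point the first Mascot query at the second spectrum, which is the off-by-one I am asking about.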
cheers, Colin