HUPO-PSI / mzTab

mzTab Reporting MS-based Proteomics and Metabolomics Results
https://hupo-psi.github.io/mzTab

In Identification Complete mzTab files, the number of protein columns grows very fast because of the mandatory fields referencing ms_run. #18

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 9 years ago


In Identification Complete mzTab files, the number of protein columns grows
very fast because of the mandatory fields referencing ms_run.

This situation is not very common, but it occurs when converting an
mzIdentML file generated with a tool like pep2pro to mzTab.

As a temporary solution, the file can be generated as an Identification Summary,
because in that mode these fields are not mandatory.

Example:
20 ms_run
2 protein_search_engine_score

Mandatory protein columns in an Identification Complete mzTab:
- columns unrelated to ms_run/protein_search_engine_score = 10 columns
- best_search_engine_score[1-n] = num protein_search_engine_score = 2 columns
- search_engine_score[1-n]_ms_run[1-n] = num protein_search_engine_score x num ms_run = 40 columns
- num_psms_ms_run[1-n] = num ms_run = 20 columns
- num_peptides_distinct_ms_run[1-n] = num ms_run = 20 columns
- num_peptides_unique_ms_run[1-n] = num ms_run = 20 columns

Protein section total columns = 112 columns
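
The arithmetic above can be reproduced with a small sketch. The function name `protein_section_columns` and its parameters are hypothetical and not part of any mzTab library. The default of 10 fixed columns is taken from the example, not derived from the specification.

```python
def protein_section_columns(num_ms_runs, num_score_types, fixed_columns=10):
    """Break down the mandatory protein-section column count for an
    Identification Complete file, following the example in this issue.

    fixed_columns is the count of columns that do not depend on ms_run or
    protein_search_engine_score (10 in the example; an assumption here).
    """
    return {
        "fixed": fixed_columns,
        # one best score column per protein search engine score
        "best_search_engine_score[1-n]": num_score_types,
        # one score column per (score, ms_run) pair
        "search_engine_score[1-n]_ms_run[1-n]": num_score_types * num_ms_runs,
        # three per-run count columns
        "num_psms_ms_run[1-n]": num_ms_runs,
        "num_peptides_distinct_ms_run[1-n]": num_ms_runs,
        "num_peptides_unique_ms_run[1-n]": num_ms_runs,
    }


breakdown = protein_section_columns(num_ms_runs=20, num_score_types=2)
for column_group, count in breakdown.items():
    print(f"{column_group}: {count}")
print("total:", sum(breakdown.values()))  # total: 112
```

Under these assumptions the total is fixed + s + r x (s + 3) for r ms_run entries and s score types, so the count grows linearly in the number of runs, with a slope that depends on the number of scores.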

Original issue reported on code.google.com by noedelta on 9 Oct 2014 at 3:13

ypriverol commented 8 years ago

@timosachsenberg what do you think?

timosachsenberg commented 8 years ago

Yes, I think an Identification Summary file is correct. Otherwise one would need to discuss relaxing the requirements for the Complete file in a new mzTab version.