epoyraz / mztab

Automatically exported from code.google.com/p/mztab

In Identification Complete mzTab files, the number of protein columns grows very fast because of the mandatory fields referencing ms_run. #18

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
In Identification Complete mzTab files, the number of protein columns grows
very fast because of the mandatory fields that reference ms_run.

This situation is not very common, but it can occur when converting an
mzIdentML file generated with a tool like pep2pro to mzTab.

As a temporary workaround, the file can be generated as an Identification
Summary file, because in that case these fields are not mandatory.

Example:
20 ms_run
2 protein_search_engine_score

-mandatory columns in an Identification Complete mzTab file that do not depend on ms_run or protein_search_engine_score = 10 columns
-best_search_engine_score[1-n] = num protein_search_engine_score = 2 columns
-search_engine_score[1-n]_ms_run[1-n] = num protein_search_engine_score x num ms_run = 40 columns
-num_psms_ms_run[1-n] = num ms_run = 20 columns
-num_peptides_distinct_ms_run[1-n] = num ms_run = 20 columns
-num_peptides_unique_ms_run[1-n] = num ms_run = 20 columns

Protein section total columns = 112 columns
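
To see how the total scales, the arithmetic above can be sketched as a small Python function. This is only an illustration of the counting in this report, not code from any mzTab library: the function name `protein_column_count` and its parameters are hypothetical, and the default of 10 fixed columns is taken from the count given above.

```python
def protein_column_count(num_ms_runs: int,
                         num_search_engine_scores: int,
                         fixed_columns: int = 10) -> int:
    """Estimate the protein section width of an Identification Complete file.

    fixed_columns is an assumption taken from this report: the mandatory
    columns that do not depend on ms_run or protein_search_engine_score.
    """
    # best_search_engine_score[1-n]: one column per search engine score
    best_scores = num_search_engine_scores
    # search_engine_score[1-n]_ms_run[1-n]: one column per (score, ms_run) pair
    per_run_scores = num_search_engine_scores * num_ms_runs
    # num_psms, num_peptides_distinct and num_peptides_unique: one each per ms_run
    per_run_counts = 3 * num_ms_runs
    return fixed_columns + best_scores + per_run_scores + per_run_counts


# The example from this report: 20 ms_run, 2 protein_search_engine_score
print(protein_column_count(20, 2))  # 112
```

The per-run terms dominate: each additional ms_run adds 3 + num protein_search_engine_score columns, so the width grows linearly in the number of runs with a slope set by the number of scores.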

Original issue reported on code.google.com by noedelta on 9 Oct 2014 at 3:13