Placeholder for exporting to mzML. This functionality should be completed in the ms_io._write_spectra_mzml method.
Non-standard information that has to be associated with each spectrum (see for example the _write_spectra_mgf method):
The cluster identifier.
The original scan number. Currently, _read_spectra_mzml extracts the scan number from the spectrum title, but the title format is different for cluster representatives. I think the best option will be to store the original scan number as a metadata element. This will also need to be handled properly when reading the mzML file in _read_spectra_mzml.
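One possible way to carry both values is as mzML userParam elements on each spectrum. The sketch below is only an illustration using the standard library, not the ms_io implementation: the helper names add_user_params and read_user_params, the parameter names "cluster" and "scan", and the example spectrum id are all assumptions, and the mzML XML namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET


def add_user_params(spectrum_elem, cluster, scan):
    """Attach the cluster identifier and original scan number as userParam children.

    The parameter names "cluster" and "scan" are hypothetical, chosen for illustration.
    """
    for name, value in (("cluster", cluster), ("scan", scan)):
        ET.SubElement(spectrum_elem, "userParam",
                      name=name, value=str(value), type="xsd:string")


def read_user_params(spectrum_elem):
    """Return all userParam name/value pairs of a spectrum element as a dict of strings."""
    return {p.get("name"): p.get("value")
            for p in spectrum_elem.findall("userParam")}


# Hypothetical spectrum element; a real mzML spectrum has more attributes and children.
spectrum = ET.Element("spectrum", id="cluster-42", index="0")
add_user_params(spectrum, cluster=42, scan=1234)
params = read_user_params(spectrum)
```

With this approach the reader would no longer need to parse the scan number out of the title, so the differing title format of cluster representatives would not matter. Values come back as strings and would have to be converted when reading.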
Run the different scripts as follows:
python spectra_add_cluster.py --spectra 01650b_BA5-TUM_first_pool_75_01_01-3xHCD-1h-R2.mzML --cluster MaRaCluster.clusters_p10.tsv maracluster --out clusters.mgf
python representative.py --filename_in clusters.mgf --filename_out representative.mgf --representative_method best_spectrum --filename_psm 01650b_BA5-TUM_first_pool_75_01_01-3xHCD-1h-R2-features-index-percolator-fdr-filter.idXML --lower_is_better
python evaluate.py --filename_spectra clusters.mgf --filename_representatives representative.mgf --filename_out out.csv --measure avg_dot --measure fraction_by --filename_psm non_existing_representative_ids.idXML