wkumler closed this 6 days ago
Preliminary results from multi_py_package_comp are shown below for a single spectrum access:
This suggests that pyOpenMS is the main competitor, with pymzml in second place, so the open question is whether they're also optimized for chromatogram extraction and RT-range extraction.
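For reference, here is a rough sketch of what the two follow-up benchmarks would need to do, independent of any reader library. The `spectra` list is synthetic, and `extract_chromatogram`, `extract_rt_range`, and the 10 ppm default are hypothetical names and values chosen for illustration, not something from multi_py_package_comp. It assumes scans are sorted by retention time and each m/z array is sorted ascending.

```python
from bisect import bisect_left, bisect_right

# Synthetic stand-in for parsed spectra (assumption, not real data):
# each entry is (retention time in seconds, sorted m/z list, intensity list).
spectra = [
    (10.0, [100.0, 118.086, 200.0], [5.0, 50.0, 1.0]),
    (20.0, [100.0, 118.087, 200.0], [4.0, 80.0, 2.0]),
    (30.0, [100.0, 150.0, 200.0], [3.0, 9.0, 1.0]),
    (40.0, [118.085, 200.0], [20.0, 2.0]),
]


def extract_chromatogram(spectra, target_mz, ppm=10.0):
    """Sum the intensity within +/- ppm of target_mz in every scan."""
    tol = target_mz * ppm / 1e6
    lo, hi = target_mz - tol, target_mz + tol
    chrom = []
    for rt, mzs, intensities in spectra:
        # Binary search for the m/z window; relies on mzs being sorted.
        start = bisect_left(mzs, lo)
        stop = bisect_right(mzs, hi)
        chrom.append((rt, sum(intensities[start:stop])))
    return chrom


def extract_rt_range(spectra, rt_min, rt_max):
    """Return all scans with rt_min <= retention time <= rt_max."""
    rts = [scan[0] for scan in spectra]
    return spectra[bisect_left(rts, rt_min):bisect_right(rts, rt_max)]


chrom = extract_chromatogram(spectra, 118.086)
window = extract_rt_range(spectra, 15.0, 35.0)
```

A library that stores data scan by scan has to touch every spectrum for the chromatogram case, which is why good single-spectrum performance does not automatically carry over.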
import pyopenms
from pyteomics import mzml
import pymzml

# Each function reads every spectrum in the file and returns a list of
# (m/z array, intensity array) pairs.
def pyopenms_fun():
    exp = pyopenms.MSExperiment()
    pyopenms.MzMLFile().load("demo_data/180205_Poo_TruePoo_Full1.mzML", exp)
    mz_intensity_pyopenms = [(spec.get_peaks()[0], spec.get_peaks()[1]) for spec in exp]
    return mz_intensity_pyopenms

def pyteomics_fun():
    mz_intensity_pyteomics = [(spec["m/z array"], spec["intensity array"]) for spec in mzml.MzML("demo_data/180205_Poo_TruePoo_Full1.mzML")]
    return mz_intensity_pyteomics

def pymzml_fun():
    run = pymzml.run.Reader("demo_data/180205_Poo_TruePoo_Full1.mzML")
    mz_intensity_pymzml = [(spec.mz, spec.i) for spec in run]
    return mz_intensity_pymzml

# Time each reader: 10 repeats of a single full-file read.
import timeit
pyopenms_times = timeit.repeat('pyopenms_fun()', globals=globals(), number=1, repeat=10)
pyteomics_times = timeit.repeat('pyteomics_fun()', globals=globals(), number=1, repeat=10)
pymzml_times = timeit.repeat('pymzml_fun()', globals=globals(), number=1, repeat=10)

# Compare the timing distributions side by side.
# Note: tick_labels requires matplotlib >= 3.9 (older versions use labels=).
import matplotlib.pyplot as plt
plt.boxplot([pyopenms_times, pyteomics_times, pymzml_times], tick_labels=['pyOpenMS', 'pyteomics', 'pymzml'])
plt.show()
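If a numeric summary is wanted alongside the boxplot, something like the sketch below could work. `summarize_times` and `dummy_reader` are hypothetical names; the dummy workload only stands in for the three reader functions so the snippet runs without any mzML file. The `timeit` documentation suggests the minimum is usually the most informative number, since higher values tend to reflect interference from other processes.

```python
import statistics
import timeit


def summarize_times(times):
    """Return the min, median, and max of a list of timings in seconds."""
    return {
        "min": min(times),
        "median": statistics.median(times),
        "max": max(times),
    }


def dummy_reader():
    # Hypothetical placeholder workload, not a real mzML reader.
    return [(i, i * 2) for i in range(10_000)]


# Same pattern as the benchmark: one call per repeat, several repeats.
dummy_times = timeit.repeat(dummy_reader, number=1, repeat=5)
summary = summarize_times(dummy_times)
print(f"dummy: min={summary['min']:.6f}s median={summary['median']:.6f}s")
```

Passing `pyopenms_times`, `pyteomics_times`, and `pymzml_times` to `summarize_times` would give one row per package.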
There are lots of different ways to get into the MS data. I'm not sure which ones are fastest or easiest, so it could be worth doing a direct comparison.
Looks like spectrum_utils uses Numba to optimize after the fact. I'm not sure I love this, since the use case could be just a single call, which wouldn't benefit much. spectrum_utils also states that "IO functionality to read spectra from MS data files is not directly included in spectrum_utils. Instead you can use excellent libraries to read a variety of mass spectrometry data formats such as Pyteomics or pymzML." So maybe it's not meant to access the spectrum? But it certainly seems to do that... I'm confused.
Looks like pyOpenMS is just a Python wrapper around the OpenMS C++ library
Unclear what pyteomics is at all
There's a comparison between pyOpenMS and pymzml here:
and a comparison between pymzml, pyOpenMS, and spectrum_utils here: