matchms / matchms

Python library for processing (tandem) mass spectrometry data and for computing spectral similarities.
Apache License 2.0

Spectra merging as pre-processing step #140

Open bachi55 opened 3 years ago

bachi55 commented 3 years ago

Is your feature request related to a problem? Please describe. Spectral databases such as MassBank provide MS/MS spectra of single compounds at multiple collision energies. See for example: AU203901 @10ev, AU203902 @10ev and AU203903 @30ev. If we want to compute the spectral similarity of this compound to another one, we would like to use all available information. One way to do that is to merge the spectra from all collision energies and use the merged spectrum for further analysis.

This request relates to #9.

Describe the solution you'd like It would be nice to have a method, as part of your pre-processing pipeline, that merges spectra of the same compound into a single one. The decision about which spectra to merge can be left to the user.

Any good starting points? In principle one could use some function like:

def merge_spectra(list_of_spectra, ppm=5):
    mz_out = []  # m/z
    int_out = []  # intensities
    for s in list_of_spectra:
        mz_out += s.get_mzs()
        int_out += s.get_ints()

    sout = Spectra(mz_out, int_out)
    # merge peaks that are very close (see code in #9)
    sout.merge_close_peaks(ppm)
    return sout
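The sketch above uses placeholder names (`Spectra`, `get_mzs`, `get_ints`, `merge_close_peaks`) rather than an existing API. To make the idea concrete, here is a minimal, self-contained version that relies only on NumPy. It rests on assumptions that are not settled in this issue: each spectrum is passed as a plain `(mz, intensities)` pair, peaks are grouped greedily when they lie within `ppm` of the first peak in the group, grouped intensities are summed, and the merged m/z is the intensity-weighted mean. Other choices (for example, keeping the maximum intensity, or normalizing each spectrum first) would be just as reasonable.

```python
import numpy as np


def merge_spectra(spectra, ppm=5.0):
    """Merge several (mz, intensities) pairs into one peak list.

    Assumption: a peak joins the current group if its m/z is within
    `ppm` parts per million of the first (lowest) m/z in that group.
    """
    if not spectra:
        return np.array([]), np.array([])

    # Pool all peaks from all spectra and sort them by m/z.
    mz = np.concatenate([np.asarray(m, dtype=float) for m, _ in spectra])
    intensities = np.concatenate([np.asarray(i, dtype=float) for _, i in spectra])
    order = np.argsort(mz)
    mz, intensities = mz[order], intensities[order]

    merged_mz, merged_int = [], []
    start = 0
    for k in range(1, len(mz) + 1):
        # Close the group at the end of the data, or when the next peak
        # is farther than the ppm tolerance from the group's first peak.
        if k == len(mz) or (mz[k] - mz[start]) > mz[start] * ppm * 1e-6:
            group_mz = mz[start:k]
            group_int = intensities[start:k]
            total = group_int.sum()
            if total > 0:
                # Intensity-weighted mean m/z (an assumed merging rule).
                merged_mz.append(np.average(group_mz, weights=group_int))
            else:
                merged_mz.append(group_mz.mean())
            merged_int.append(total)
            start = k

    return np.array(merged_mz), np.array(merged_int)


# Hypothetical example data: two spectra of the same compound.
low_energy = ([100.0000, 150.0000], [10.0, 5.0])
high_energy = ([100.0002, 200.0000], [30.0, 2.0])
mz, intensities = merge_spectra([low_energy, high_energy], ppm=5)
```

In this made-up example the two peaks near m/z 100 differ by 2 ppm, so they collapse into a single peak with summed intensity 40, while the peaks at m/z 150 and 200 are kept unchanged.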

Best, Eric

florian-huber commented 3 years ago

Thanks for starting this issue, @bachi55. To me this seems clearly useful to include in matchms.

A few further questions, first the practical one: Would you be interested in implementing this? Or in working on a PR together? I think we can also implement it from our side, but that could take some time.

And some more technical things as well: The function you sketch above seems a good starting point to me. Two things that crossed my mind:

github-actions[bot] commented 1 week ago

This issue is stale because it has been open for 180 days with no activity.