Open PMSeitzer opened 2 years ago
Return to this after starting `mass_spec` case https://github.com/calico/mass_spec/issues/752
Implement spectral entropy score
paper: https://www.nature.com/articles/s41592-021-01331-z
source code: https://github.com/YuanyueLi/SpectralEntropy/blob/master/spectral_entropy/spectral_entropy.py#L26-L28
`scipy.stats.entropy`: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.entropy.html
more information about various spectral scoring approaches: https://www.biorxiv.org/content/biorxiv/early/2022/06/02/2022.06.01.494370.full.pdf
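For reference, a minimal sketch of the unweighted entropy similarity from the paper linked above, assuming both spectra have already been aligned onto a common m/z grid (equal-length intensity lists, 0 where a peak is absent). The function names are hypothetical, and the intensity weighting the paper applies to low-entropy spectra is left out. `shannon_entropy` is written in plain Python here; `scipy.stats.entropy` computes the same quantity.

```python
import math


def shannon_entropy(intensities):
    """Shannon entropy (natural log) of intensities normalized to sum to 1."""
    total = sum(intensities)
    if total <= 0:
        return 0.0
    return -sum((x / total) * math.log(x / total) for x in intensities if x > 0)


def entropy_similarity(a, b):
    """Unweighted entropy similarity of two aligned intensity vectors.

    Assumes a[k] and b[k] refer to the same m/z bin (alignment is done elsewhere).
    Returns a value in [0, 1]: 1 for identical spectra, 0 for no shared peaks.
    """
    if len(a) != len(b):
        raise ValueError("spectra must be aligned to the same m/z bins")
    sum_a, sum_b = sum(a), sum(b)
    if sum_a <= 0 or sum_b <= 0:
        return 0.0
    pa = [x / sum_a for x in a]
    pb = [x / sum_b for x in b]
    # 1:1 mixture of the two normalized spectra
    mixed = [(x + y) / 2 for x, y in zip(pa, pb)]
    s_ab = shannon_entropy(mixed)
    return 1.0 - (2 * s_ab - shannon_entropy(pa) - shannon_entropy(pb)) / math.log(4)
```

The subtracted term is the Jensen-Shannon divergence of the two normalized spectra, rescaled by ln(4) so the result stays between 0 and 1.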
Modified cosine score, as it is explained in the original manuscript (https://www.pnas.org/doi/abs/10.1073/pnas.1203689109):
Vector similarities are calculated for every possible pair of spectra with a minimum of six matching fragment ions (i.e., peaks) with similarity determined by using a modified cosine calculation that takes into account the relative intensities of the fragment ions as well as the precursor m/z difference between the paired spectra
This has come to mean something more specific:
Two peaks are considered a potential match if their m/z ratios lie within the given ‘tolerance’, or if their m/z ratios lie within the tolerance once a mass-shift is applied. The mass shift is simply the difference in precursor-m/z between the two spectra.
So, a peak may match another peak after a mass-shift is applied.
See this implementation: https://github.com/matchms/matchms/blob/master/matchms/similarity/ModifiedCosine.py#L109-L129
It looks like they are matching to both the `m/z` and `NL m/z` of an observed spectrum (where `NL m/z = precursorMz - fragMz`).
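A simplified sketch of the matching rule described above, not the matchms implementation: the function name, the `(mz, intensity)` peak layout, the default tolerance, and the greedy assignment by intensity product are all assumptions made for illustration. A shifted match `mz_a ≈ mz_b + (precursor_a - precursor_b)` is the same condition as equal neutral losses, `precursor_a - mz_a ≈ precursor_b - mz_b`.

```python
import math


def modified_cosine(peaks_a, precursor_a, peaks_b, precursor_b, tolerance=0.01):
    """Return (score, n_matches) for two spectra given as lists of (mz, intensity).

    Two peaks are candidate matches if their m/z values agree within `tolerance`
    either directly or after shifting by the precursor m/z difference.
    """
    shift = precursor_a - precursor_b
    candidates = []
    for i, (mz_a, int_a) in enumerate(peaks_a):
        for j, (mz_b, int_b) in enumerate(peaks_b):
            direct = abs(mz_a - mz_b) <= tolerance
            shifted = abs(mz_a - (mz_b + shift)) <= tolerance
            if direct or shifted:
                candidates.append((int_a * int_b, i, j))

    # Greedy assignment: take the highest intensity products first,
    # using each peak at most once.
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    dot, n_matches = 0.0, 0
    for product, i, j in candidates:
        if i in used_a or j in used_b:
            continue
        used_a.add(i)
        used_b.add(j)
        dot += product
        n_matches += 1

    norm_a = math.sqrt(sum(inten * inten for _, inten in peaks_a))
    norm_b = math.sqrt(sum(inten * inten for _, inten in peaks_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0, 0
    return dot / (norm_a * norm_b), n_matches
```

The returned match count makes it possible to apply a minimum-matched-peaks filter, such as the six-peak minimum in the original manuscript, outside this function.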
Inspired by ASMS 2024, re-opening this case. Introduced now in modified cosine score, flash entropy, Kullback-Leibler divergence, Jensen-Shannon divergence, etc.
spectral entropy python: https://github.com/YuanyueLi/SpectralEntropy
Including (but not limited to) modified cosine score, neutral loss match score, and the updated cosine score.
This can be used very generally, and updated in the GUI, but was originally devised as a part of #543, #546, and associated `mass_spec` case https://github.com/calico/mass_spec/issues/752