OpenMS / pyopenms-docs

pyOpenMS readthedocs documentation, additional utilities, addons, scripts, and examples.
https://pyopenms.readthedocs.io

Change in Isotope Pattern Generator #411

Closed dlon450 closed 10 months ago

dlon450 commented 10 months ago

Describe the problem you encountered

Hi OpenMS team, I wanted to ask about a difference between pyopenms versions 3.0.0 and 2.7.0 in the CoarseIsotopePatternGenerator and FineIsotopePatternGenerator classes: the m/z and intensity values they return seem noticeably different.

To Reproduce

Here's one example. In version 3.0.0:

>>> find_isotope_pattern_coarse('C636H1007N167O178S5PtH-2', peak=14194.32)         
(array([14189.31094916, 14190.314304  , 14191.31765884, 14192.32101367,
       14193.32436851, 14194.32772335, 14195.33107819, 14196.33443303,
       14197.33778786, 14198.3411427 ]), array([0.00363849, 0.01054412, 0.02373276, 0.04367016, 0.0680778 ,
       0.09226516, 0.11084374, 0.11982168, 0.11793806, 0.10671093]))

Whereas in version 2.7.0:

>>> find_isotope_pattern_coarse('C636H1007N167O178S5PtH-2', peak=14194.32)         
(array([14189.31726057, 14190.32061541, 14191.32397025, 14192.32732509,
       14193.33067993, 14194.33403476, 14195.3373896 , 14196.34074444,
       14197.34409928, 14198.34745411]), array([0.04367016, 0.0680778 , 0.09226516, 0.11084373, 0.11982169,
       0.11793808, 0.10671095, 0.08945209, 0.06992197, 0.05124441]))

Function (replace CoarseIsotopePatternGenerator with FineIsotopePatternGenerator to reproduce with hyperfine isotope distribution):

import numpy as np
from pyopenms import EmpiricalFormula, CoarseIsotopePatternGenerator

def find_isotope_pattern_coarse(formula_str: str, peak=None, interval=5.):
    '''
    Return coarse theoretical distribution of intensities
    '''
    seq_formula = EmpiricalFormula(formula_str)
    isotopes = seq_formula.getIsotopeDistribution( CoarseIsotopePatternGenerator() )
    isotopes_container = isotopes.getContainer()

    if peak is not None:
        lower, upper = binarySearchInterval(isotopes_container, np.floor(peak - interval), np.ceil(peak + interval))
        isotopes_container = isotopes_container[lower:upper]

    o = np.transpose([[iso.getMZ(), iso.getIntensity()] for iso in isotopes_container])
    return o[0], o[1]

def binarySearchInterval(isotopes, lbound, ubound):
    '''
    Find isotopes with masses in given interval
    '''
    low = 0
    high = len(isotopes) - 1

    lower_idx = binarySearch(isotopes, low, high, lbound)
    upper_idx = binarySearch(isotopes, lower_idx, high, ubound) - 1
    return lower_idx, upper_idx

def binarySearch(isotopes, low, high, X):
    '''
    Search for index position of mass X in isotopes
    '''
    mid = 0
    while low <= high:
        mid = (high + low) // 2
        mass = isotopes[mid].getMZ()
        if mass < X:
            low = mid + 1
        elif mass > X:
            high = mid - 1
        else:
            return mid
    return high + 1

In particular, the most intense isotope peak has an m/z of 14193.33067993 in version 2.7.0 (the value we need for our work), whereas in version 3.0.0 it has an m/z of 14196.33443303. Does anyone know what might be causing this? Thanks!
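To make the difference concrete, here is a minimal check using only NumPy and the numbers pasted from the two outputs above (no pyOpenMS call is involved). It locates the most intense peak in each window and tests whether the two intensity patterns agree once that offset is undone. The variable names are just for illustration.

```python
import numpy as np

# Values copied from the two outputs reported above.
mz_v3 = np.array([14189.31094916, 14190.314304, 14191.31765884, 14192.32101367,
                  14193.32436851, 14194.32772335, 14195.33107819, 14196.33443303,
                  14197.33778786, 14198.3411427])
int_v3 = np.array([0.00363849, 0.01054412, 0.02373276, 0.04367016, 0.0680778,
                   0.09226516, 0.11084374, 0.11982168, 0.11793806, 0.10671093])
mz_v27 = np.array([14189.31726057, 14190.32061541, 14191.32397025, 14192.32732509,
                   14193.33067993, 14194.33403476, 14195.3373896, 14196.34074444,
                   14197.34409928, 14198.34745411])
int_v27 = np.array([0.04367016, 0.0680778, 0.09226516, 0.11084373, 0.11982169,
                    0.11793808, 0.10671095, 0.08945209, 0.06992197, 0.05124441])

# Index of the most intense peak in each window.
apex_v3 = int(np.argmax(int_v3))
apex_v27 = int(np.argmax(int_v27))

# Number of peak positions by which the apex has moved between versions.
shift = apex_v3 - apex_v27

# Compare the overlapping part of the two patterns after undoing the shift.
same_shape = np.allclose(int_v3[shift:], int_v27[:len(int_v27) - shift], atol=1e-6)

print(f"apex m/z 3.0.0: {mz_v3[apex_v3]:.5f}, 2.7.0: {mz_v27[apex_v27]:.5f}")
print(f"shift: {shift} peaks, same shape after shift: {same_shape}")
```

On these numbers the apex moves by three peak positions and the overlapping intensities agree, which suggests that the shape of the pattern is unchanged and only its position along the mass axis differs.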


jpfeuffer commented 10 months ago

Hi! Thanks for reporting!

I suspect it is indeed a bug introduced when we corrected our definition of monoisotopic (the most abundant isotope instead of the lightest).

Can you confirm by checking a formula without elements whose monoisotope is not the lightest (in your case probably without platinum)? If my theory is correct those patterns should not be affected.
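As a rough illustration of which elements fall into that category, the sketch below compares the lightest and the most abundant isotope for three of the elements in the formula. The abundance table holds rounded textbook values typed in by hand, not data read from pyOpenMS, and the helper name `lightest_and_most_abundant` is hypothetical.

```python
# Approximate natural abundances (percent) keyed by mass number.
# Rounded textbook values, included only to illustrate the distinction.
ABUNDANCE = {
    "C": {12: 98.93, 13: 1.07},
    "S": {32: 94.99, 33: 0.75, 34: 4.25, 36: 0.01},
    "Pt": {190: 0.012, 192: 0.782, 194: 32.86, 195: 33.78, 196: 25.21, 198: 7.36},
}


def lightest_and_most_abundant(element):
    """Return (lightest mass number, most abundant mass number) for an element."""
    isotopes = ABUNDANCE[element]
    lightest = min(isotopes)
    most_abundant = max(isotopes, key=isotopes.get)
    return lightest, most_abundant


for element in ABUNDANCE:
    lightest, most_abundant = lightest_and_most_abundant(element)
    differs = lightest != most_abundant
    print(f"{element}: lightest={lightest}, most abundant={most_abundant}, differs={differs}")
```

With these values only platinum has a most abundant isotope that is not its lightest one, so a formula without Pt is a reasonable control for the test suggested above.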

jpfeuffer commented 10 months ago

By the way, FineIsotope should not be affected (which is also what I am seeing when I run your code).

dlon450 commented 10 months ago

Thanks for your response. Without platinum the function returns the same value (your theory is correct)! You are also right that the hyperfine generator is not affected.

dlon450 commented 10 months ago

Hey @jpfeuffer, thank you very much for fixing this issue. Just wondering how I would install the new update? Is it fine just to pip install pyopenms? Or will I need to wait for the next version?

jpfeuffer commented 10 months ago

Hi! For now you will need to use our nightly PyPI package to try the fix: https://pyopenms.readthedocs.io/en/latest/user_guide/installation.html#nightly-ci-wheels

pip install --index-url https://pypi.cs.uni-tuebingen.de/simple/ pyopenms
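After installing, it can help to confirm which build is actually in the active environment. The sketch below uses only the standard library (`importlib.metadata`, Python 3.8+); the helper name `installed_version` is hypothetical, and what a nightly version string looks like is not specified in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints None if pyopenms is not installed in this environment.
print("pyopenms:", installed_version("pyopenms"))
```

If this still reports 3.0.0, pip may have kept the existing release; uninstalling pyopenms before running the command above is one way to rule that out.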