TimOliverMaier / pystoms

Analysis of precursor MS features using bayesian statistics.
GNU General Public License v3.0
0 stars 0 forks source link

load_hull_data_3d fails for some features #1

Open TimOliverMaier opened 2 years ago

TimOliverMaier commented 2 years ago

For reproduction feature 606 in M210115_001_Slot1-1_1_850.d

data_handle = PyTimsDataHandle('/home/tim/Master/MassSpecDaten/M210115_001_Slot1-1_1_850.d/')
# precursors are listed in precursor table
precursor_table = data_handle.get_selected_precursors()

# ectract some features with random ids
features = [2928,2320,4011,3016,606,3026,702,1220]
accepted_feature_ids = []
feature_data = []
charges = []

for feature_id in features:
    feature = FeatureLoaderDDA(data_handle,feature_id)
    # estimate feature hull boundaries with averagine model for isotopic pattern and gaussian model for IMS
    data_tmp = feature.load_hull_data_3d(intensity_min=0,ims_model="gaussian",plot_feature=False)
    if len(data_tmp)<10:
        continue
    accepted_feature_ids.append(feature_id)
    feature_data.append(data_tmp.nlargest(10,columns="Intensity"))
    charges.append(feature.charge)

n = len(feature_data)
RuntimeError                              Traceback (most recent call last)
/home/tim/Master/prototyping/TutorialsAndArchive/model_parallel.ipynb Cell 8' in <cell line: 13>()
     [14](vscode-notebook-cell:/home/tim/Master/prototyping/TutorialsAndArchive/model_parallel.ipynb#ch0000009?line=13)[ feature = FeatureLoaderDDA(data_handle,feature_id)
     ]()[15](vscode-notebook-cell:/home/tim/Master/prototyping/TutorialsAndArchive/model_parallel.ipynb#ch0000009?line=14)[ # estimate feature hull boundaries with averagine model for isotopic pattern and gaussian model for IMS
---> ]()[16](vscode-notebook-cell:/home/tim/Master/prototyping/TutorialsAndArchive/model_parallel.ipynb#ch0000009?line=15)[ data_tmp = feature.load_hull_data_3d(intensity_min=0,ims_model="gaussian",plot_feature=False)
     ]()[17](vscode-notebook-cell:/home/tim/Master/prototyping/TutorialsAndArchive/model_parallel.ipynb#ch0000009?line=16)[ if len(data_tmp)<10:
     ]()[18](vscode-notebook-cell:/home/tim/Master/prototyping/TutorialsAndArchive/model_parallel.ipynb#ch0000009?line=17)[     continue

File ~/Master/pystoms/src/pystoms/feature_loader_dda.py:156, in FeatureLoaderDDA.load_hull_data_3d(self, intensity_min, ims_model, mz_peak_width, averagine_prob_target, plot_feature, scan_range)
    ]()[149](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=148)[     mono_profile_data = self.get_monoisotopic_profile(
    ]()[150](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=149)[                                   self.monoisotopic_mz,
    ]()[151](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=150)[                                   self.scan_number,
    ]()[152](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=151)[                                   frame_init,
    ]()[153](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=152)[                                   scan_range//2,
    ]()[154](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=153)[                                   mz_peak_width/2)
    ]()[155](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=154)[     # estimate scan boundaries
--> ]()[156](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=155)[     scan_min_estimated,scan_max_estimated = self._get_scan_boundaries(
    ]()[157](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=156)[                                                       mono_profile_data,
    ]()[158](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=157)[                                                       ims_model)
    ]()[160](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=159)[ elif ims_model == "DBSCAN":
    ]()[162](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=161)[     clustered_data, mi_cluster_id = precursor_dbscan_3d(
    ]()[163](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=162)[                                       frame_init,
    ]()[164](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=163)[                                       add_point = (self.monoisotopic_mz,
    ]()[165](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=164)[                                                   self.scan_number),
    ]()[166](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=165)[                                       plot=plot_feature)

File ~/Master/pystoms/src/pystoms/feature_loader_dda.py:72, in FeatureLoaderDDA._get_scan_boundaries(self, datapoints, ims_model, cut_off_left, cut_off_right)
     ]()[67](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=66)[ y = datapoints.T[1]
     ]()[69](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=68)[ if ims_model == "gaussian":
     ]()[70](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=69)[     # fit model function
---> ]()[72](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=71)[     param_opt,param_cov = curve_fit(_gauss, # pylint: disable=unused-variable
     ]()[73](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=72)[                                     x,
     ]()[74](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=73)[                                     y,
     ]()[75](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=74)[                                     bounds=([y.min(),x.min(),0],
     ]()[76](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=75)[                                     [y.max(),x.max(),inf]))
     ]()[78](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=77)[     # instantiate a normal distribution with calculated parameters
     ]()[79](file:///home/tim/Master/pystoms/src/pystoms/feature_loader_dda.py?line=78)[     fit_dist = norm(param_opt[1],param_opt[2])

File ~/miniconda3/envs/proteolizard/lib/python3.8/site-packages/scipy/optimize/minpack.py:804, in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, bounds, method, jac, **kwargs)
    ]()[800](file:///home/tim/miniconda3/envs/proteolizard/lib/python3.8/site-packages/scipy/optimize/minpack.py?line=799)[ res = least_squares(func, p0, jac=jac, bounds=bounds, method=method,
    ]()[801](file:///home/tim/miniconda3/envs/proteolizard/lib/python3.8/site-packages/scipy/optimize/minpack.py?line=800)[                     **kwargs)
    ]()[803](file:///home/tim/miniconda3/envs/proteolizard/lib/python3.8/site-packages/scipy/optimize/minpack.py?line=802)[ if not res.success:
--> ]()[804](file:///home/tim/miniconda3/envs/proteolizard/lib/python3.8/site-packages/scipy/optimize/minpack.py?line=803)[     raise RuntimeError("Optimal parameters not found: " + res.message)
    ]()[806](file:///home/tim/miniconda3/envs/proteolizard/lib/python3.8/site-packages/scipy/optimize/minpack.py?line=805)[ ysize = len(res.fun)
    ]()[807](file:///home/tim/miniconda3/envs/proteolizard/lib/python3.8/site-packages/scipy/optimize/minpack.py?line=806)[ cost = 2 * res.cost  # res.cost is half sum of squares!

RuntimeError: Optimal parameters not found: The maximum number of function evaluations is exceeded.]()
TimOliverMaier commented 2 years ago

An Addition to this issue: Scipy sometimes cannot find optimal parameters for gauss function whilst estimating the scan boundaries. This happens for feature 118829 i.e. (same dataset). This might be caused by a initially too widely chosen scan range