MStarmans91 / WORC

Workflow for Optimal Radiomics Classification

[BUG] encounter divide by zero error when using multilabel tag #73

Closed hitagisknife closed 1 year ago

hitagisknife commented 1 year ago

Describe the bug

At the performance/plot_Estimator step, WORC reports "stats.py:7028: RuntimeWarning: divide by zero encountered in double_scalars\n z = (bigu - meanrank) / sd"; "sd" is always zero.

I found that this is caused by StatisticalTestThreshold.py:86. On that line, "class1" is always an empty list, because the comparison uses n_label rather than the loop variable. After changing lines 85-87 as shown below, it works:

#before
  for i in range(n_label):
      class1 = [i for j, i in enumerate(fv) if np.argmax(Y_train[j]) == n_label]
      class2 = [i for j, i in enumerate(fv) if np.argmax(Y_train[j]) != n_label]
#after
  for i_label in range(n_label):
      class1 = [i for j, i in enumerate(fv) if np.argmax(Y_train[j]) == i_label]
      class2 = [i for j, i in enumerate(fv) if np.argmax(Y_train[j]) != i_label]
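
A minimal sketch of why the original comparison fails, assuming one-hot label rows like those in the label example. The helper `split_by_label` and the values in `fv` are hypothetical, and `argmax` is a plain-Python stand-in for `np.argmax`, so this is not WORC's actual code. For a row of length `n_label`, the argmax is an index between 0 and `n_label - 1`, so it can never equal `n_label` and `class1` stays empty on every iteration.

```python
def argmax(row):
    """Index of the largest entry; plain-Python stand-in for np.argmax."""
    return max(range(len(row)), key=row.__getitem__)


def split_by_label(fv, Y_train, target):
    """Split feature values into the target class and all other classes."""
    class1 = [v for j, v in enumerate(fv) if argmax(Y_train[j]) == target]
    class2 = [v for j, v in enumerate(fv) if argmax(Y_train[j]) != target]
    return class1, class2


# One-hot labels for five patients and four labels (toy data).
Y_train = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
fv = [0.3, 1.2, 0.9, 2.5, 4.1]  # hypothetical values of a single feature
n_label = len(Y_train[0])

# Before the fix: comparing against n_label never matches any sample.
buggy_class1, buggy_class2 = split_by_label(fv, Y_train, n_label)

# After the fix: compare against the loop variable i_label instead.
fixed = [split_by_label(fv, Y_train, i_label) for i_label in range(n_label)]

print(buggy_class1)                   # empty list for every label
print([c1 for c1, _ in fixed])        # one non-empty group per label
```

With the buggy comparison, the statistical test receives an empty first sample, which is consistent with the zero "sd" reported in the warning.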

WORC configuration

modus = 'multiclass_classification'

Label example

Patient  Label_1  Label_2  Label_3  Label_4
id_1     0        1        0        0
id_2     1        0        0        0
id_3     1        0        0        0
id_4     0        0        1        0
id_5     0        0        0        1
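
To make the label layout concrete, the sketch below reads the table above as one-hot rows and derives one class index per patient, which is the value `np.argmax(Y_train[j])` would produce in the snippet. The `parse_labels` helper is hypothetical and is not WORC's label loader; it only assumes whitespace-separated columns.

```python
from collections import Counter

LABEL_TABLE = """\
Patient Label_1 Label_2 Label_3 Label_4
id_1 0 1 0 0
id_2 1 0 0 0
id_3 1 0 0 0
id_4 0 0 1 0
id_5 0 0 0 1
"""


def parse_labels(text):
    """Return the label names and a patient -> one-hot row mapping."""
    header, *rows = [line.split() for line in text.strip().splitlines()]
    label_names = header[1:]
    labels = {row[0]: [int(x) for x in row[1:]] for row in rows}
    return label_names, labels


label_names, labels = parse_labels(LABEL_TABLE)

# Position of the 1 in each one-hot row, i.e. the patient's class index.
class_index = {pid: onehot.index(max(onehot)) for pid, onehot in labels.items()}

# Number of patients per class index.
counts = Counter(class_index.values())

print(class_index)
print(counts)
```

Every class index falls in the range 0 to 3, so a comparison against the number of labels (4) matches no patient.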

Logging error

Traceback (most recent call last):
  File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 434, in format
    return self._format(record)
  File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 430, in _format
    return self._fmt % record.__dict__
KeyError: 'tracking_id'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 1083, in emit
    msg = self.format(record)
  File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 927, in format
    return fmt.format(record)
  File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 666, in format
    s = self.formatMessage(record)
  File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 635, in formatMessage
    return self._style.format(record)
  File "/home/z/anaconda3/envs/worc/lib/python3.9/logging/__init__.py", line 436, in format
    raise ValueError('Formatting field not found in record: %s' % e)
ValueError: Formatting field not found in record: 'tracking_id'
Call stack:
  File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/resources/fastr_tools/worc/bin/PlotEstimator.py", line 89, in <module>
    main()
  File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/resources/fastr_tools/worc/bin/PlotEstimator.py", line 77, in main
    plot_estimator_performance(prediction=args.prediction,
  File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/plotting/plot_estimator_performance.py", line 438, in plot_estimator_performance
    fitted_model.create_ensemble(X_train_temp, Y_train_temp,
  File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/classification/SearchCV.py", line 1428, in create_ensemble
    base_estimator.refit_and_score(X_train, Y_train, p_all,
  File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/classification/SearchCV.py", line 912, in refit_and_score
    out = fit_and_score(X_fit, y, self.scoring,
  File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/classification/fitandscore.py", line 770, in fit_and_score
    StatisticalSel.fit(X_train, y)
  File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/WORC-3.6.0-py3.9.egg/WORC/featureprocessing/StatisticalTestThreshold.py", line 90, in fit
    metric_value_temp = self.metric_function(class1, class2, **self.parameters)[1]
  File "/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/scipy/stats/stats.py", line 7028, in mannwhitneyu
    z = (bigu - meanrank) / sd
  File "/home/z/anaconda3/envs/worc/lib/python3.9/warnings.py", line 109, in _showwarnmsg
    sw(msg.message, msg.category, msg.filename, msg.lineno,
Message: '%s'
Arguments: ('/home/z/anaconda3/envs/worc/lib/python3.9/site-packages/scipy/stats/stats.py:7028: RuntimeWarning: divide by zero encountered in double_scalars\n z = (bigu - meanrank) / sd\n',)
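
The `tracking_id` error in the trace is a secondary failure: the warning was passed to a log formatter whose format string references a field that the log record does not carry. The sketch below reproduces that behaviour with the standard `logging` module only. The format string and record contents are assumptions for illustration; the thread does not say which logger configuration requires `tracking_id`.

```python
import logging

# Hypothetical format string that requires a custom 'tracking_id' field.
formatter = logging.Formatter("%(tracking_id)s %(message)s")

# A plain record, such as one created for a redirected warning, lacks that field.
record = logging.LogRecord(
    name="demo",
    level=logging.WARNING,
    pathname="demo.py",
    lineno=1,
    msg="divide by zero encountered",
    args=(),
    exc_info=None,
)

try:
    formatter.format(record)
    error = None
except (KeyError, ValueError) as exc:
    # Python 3.8+ wraps the KeyError in a ValueError, as in the trace above.
    error = exc

# Once the record carries the field, formatting succeeds.
record.tracking_id = "abc"
formatted = formatter.format(record)
print(repr(error))
print(formatted)
```

This only explains why the warning surfaced as a logging error; the underlying cause remains the empty "class1" list.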


MStarmans91 commented 1 year ago

Sorry for the late reply; I seem to have missed your issue entirely. Thanks for spotting it. I've incorporated your suggestion in the latest v3.6.2 release. #76