CellProfiler / CellProfiler-Analyst

Open-source software for exploring and analyzing large, high-dimensional image-derived data.
http://cellprofileranalyst.org

[Python3_Port] Error when training LogisticRegression classifier #289

Open pearlryder opened 3 years ago

pearlryder commented 3 years ago

OSX, 10.15.7, example training set or my own data

open classifier, select LogisticRegression, press Train. This error results:

An error occurred in the program:
AttributeError: 'str' object has no attribute 'decode'

Traceback (most recent call last):
  File "/Users/pryder/GitHub/CellProfiler-Analyst/cpa/classifier.py", line 1557, in OnTrainClassifier
    self.TrainClassifier()
  File "/Users/pryder/GitHub/CellProfiler-Analyst/cpa/classifier.py", line 1607, in TrainClassifier
    self.algorithm.Train(self.trainingSet.label_array, self.trainingSet.values, output)
  File "/Users/pryder/GitHub/CellProfiler-Analyst/cpa/generalclassifier.py", line 222, in Train
    self.classifier.fit(values, labels)
  File "/Users/pryder/Library/Python/3.8/lib/python/site-packages/sklearn/linear_model/_logistic.py", line 1407, in fit
    fold_coefs_ = Parallel(n_jobs=self.n_jobs, verbose=self.verbose,
  File "/Users/pryder/.virtualenvs/CellProfiler-Analyst/lib/python3.8/site-packages/joblib/parallel.py", line 1041, in __call__
    if self.dispatch_one_batch(iterator):
  File "/Users/pryder/.virtualenvs/CellProfiler-Analyst/lib/python3.8/site-packages/joblib/parallel.py", line 859, in dispatch_one_batch
    self._dispatch(tasks)
  File "/Users/pryder/.virtualenvs/CellProfiler-Analyst/lib/python3.8/site-packages/joblib/parallel.py", line 777, in _dispatch
    job = self._backend.apply_async(batch, callback=cb)
  File "/Users/pryder/.virtualenvs/CellProfiler-Analyst/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 208, in apply_async
    result = ImmediateResult(func)
  File "/Users/pryder/.virtualenvs/CellProfiler-Analyst/lib/python3.8/site-packages/joblib/_parallel_backends.py", line 572, in __init__
    self.results = batch()
  File "/Users/pryder/.virtualenvs/CellProfiler-Analyst/lib/python3.8/site-packages/joblib/parallel.py", line 262, in __call__
    return [func(*args, **kwargs)
  File "/Users/pryder/.virtualenvs/CellProfiler-Analyst/lib/python3.8/site-packages/joblib/parallel.py", line 262, in <listcomp>
    return [func(*args, **kwargs)
  File "/Users/pryder/Library/Python/3.8/lib/python/site-packages/sklearn/linear_model/_logistic.py", line 762, in _logistic_regression_path
    n_iter_i = _check_optimize_result(
  File "/Users/pryder/Library/Python/3.8/lib/python/site-packages/sklearn/utils/optimize.py", line 243, in _check_optimize_result
    ).format(solver, result.status, result.message.decode("latin1"))

bethac07 commented 3 years ago

Looks like a bug in sklearn, you can report it there if it isn't already reported!
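
For context, the last frame of the traceback calls result.message.decode("latin1"). That only works when the message is a bytes object; a Python 3 str has no decode method, which matches the AttributeError above. The sketch below shows a tolerant way to handle both cases. The helper name message_to_text is hypothetical and is not part of sklearn; why the message arrives as a str on this machine is not established in this thread.

```python
def message_to_text(message, encoding="latin1"):
    """Return an optimizer status message as str, whether it arrives as bytes or str.

    Hypothetical helper for illustration only; not an sklearn function.
    """
    if isinstance(message, bytes):
        return message.decode(encoding)
    return str(message)


# bytes input is decoded; str input passes through unchanged
print(message_to_text(b"STOP: TOTAL NO. of ITERATIONS REACHED LIMIT"))
print(message_to_text("STOP: TOTAL NO. of ITERATIONS REACHED LIMIT"))
```

Calling .decode("latin1") directly on the second input would raise the same AttributeError reported here.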

DavidStirling commented 3 years ago

open classifier, select LogisticRegression, press Train.

'Train' shouldn't be available unless you have data in the bins. Are you loading a training set or filling the bins manually?

DavidStirling commented 3 years ago

With the example set it's working fine on my end. Could you pip freeze and report your sklearn version?

pearlryder commented 3 years ago

scikit-learn==0.24.1

And I loaded from a training set. Other classifiers do train, so there seems to be data in the bins.
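
One thing that may be worth checking: in the traceback, sklearn is loaded from /Users/pryder/Library/Python/3.8/lib/python/site-packages, while joblib is loaded from the CellProfiler-Analyst virtualenv. If pip freeze was run inside the virtualenv, it might not describe the copy of sklearn that is actually imported. This is an inference from the paths, not something confirmed in the thread. A minimal sketch for printing the version and location of what Python really imports (describe_module is a hypothetical helper name):

```python
import importlib


def describe_module(name):
    """Return (version, file path) for an importable module, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Not every module defines these attributes, so fall back to a placeholder
    version = getattr(module, "__version__", "unknown")
    path = getattr(module, "__file__", "unknown")
    return version, path


# Run this with the same interpreter that launches CellProfiler-Analyst
for name in ("sklearn", "scipy", "joblib"):
    print(name, describe_module(name))
```

Comparing the printed paths and versions between the two machines would show whether both are really running the same sklearn and scipy.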

DavidStirling commented 3 years ago

Same version here with no error. Using the example training set, you should be getting a "ConvergenceWarning: lbfgs failed to converge (status=1): STOP: TOTAL NO. of ITERATIONS REACHED LIMIT." warning in the console when using this classifier. Do you see this? If not, I guess that's what's failing to log.

pearlryder commented 3 years ago

I don't see that message in the console. My understanding then is that the classifier never converges but the error doesn't log properly on my system. That seem right?
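
To check whether a warning is emitted at all, independent of how the console or an exception catcher handles it, the call can be wrapped with the standard warnings module. The sketch below uses only the standard library; run_and_collect_warnings and noisy are hypothetical names, and noisy is just a stand-in for a fit call that fails to converge.

```python
import warnings


def run_and_collect_warnings(func, *args, **kwargs):
    """Call func and return (result, list of 'Category: message' strings for warnings raised)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # make sure no warning is suppressed by filters
        result = func(*args, **kwargs)
    return result, [f"{w.category.__name__}: {w.message}" for w in caught]


def noisy():
    # Stand-in for a fit call that fails to converge
    warnings.warn("lbfgs failed to converge (status=1)", RuntimeWarning)
    return "done"


result, messages = run_and_collect_warnings(noisy)
print(result, messages)
```

In this issue the exception is raised while the warning text is being formatted, so on the failing system nothing would be collected and the AttributeError would propagate instead. On a working system, wrapping the self.classifier.fit(values, labels) call this way should surface the ConvergenceWarning quoted above.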

DavidStirling commented 3 years ago

Yup, I'd check sklearn's issues for this. It could be related to our exception catcher, but I don't think it is.