EpistasisLab / tpot

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.
http://epistasislab.github.io/tpot/
GNU Lesser General Public License v3.0

Cannot finish optimizing by TPOT 0.70 without any errors #390

Closed YukiSakai1209 closed 7 years ago

YukiSakai1209 commented 7 years ago

TPOT 0.6x has worked well in my environment (Ubuntu 14.04, anaconda3-4.0.0).

After I upgraded TPOT from 0.68 to 0.70 yesterday, the optimization stops a few minutes after it starts and never finishes. No errors are reported.

This happens even with the examples provided by TPOT (e.g. MNIST or Iris).

To reproduce it in my environment:

  1. Create a TPOT instance.
  2. Call the TPOT fit() function with training data.
  3. TPOT stops after a few minutes without any errors.

I would like to fix the problem, since the features provided by TPOT 0.70 are really important.
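When a run stalls without raising anything, one generic way to see where it is stuck is Python's standard faulthandler module. The following is only a sketch and not something suggested in this thread: the helper name dump_stacks_if_stuck and the 300-second timeout are illustrative assumptions, and the fit() call appears only in a comment.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def dump_stacks_if_stuck(timeout_s, file=None):
    """Dump the stack of every thread each `timeout_s` seconds while the block runs.

    Hypothetical helper for diagnosing silent hangs. `file` must be a real
    file object with a file descriptor; it defaults to sys.stderr.
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)
    try:
        yield
    finally:
        # Stop the watchdog once the block finishes or raises.
        faulthandler.cancel_dump_traceback_later()


# Assumed usage around the step that appears to hang:
#
#     with dump_stacks_if_stuck(300):
#         tpot.fit(X_train, y_train)
```

If the optimization is still running when the timeout expires, the tracebacks written to stderr show which call each thread is sitting in, which gives something concrete to attach to a report.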

weixuanfu commented 7 years ago

Thank you for reporting this issue here. I will look into it.

weixuanfu commented 7 years ago

I tested both the Iris and MNIST examples with TPOT 0.7.0 on my Ubuntu 16.04 environment with anaconda3. Both examples worked fine there. I will try a virtual machine later to check whether this issue is OS-specific.

Meanwhile, could you please try either of these two examples again with verbosity=3? Error messages may then show up. Alternatively, you could create a Python 3.5 environment with the command

conda create -n py35 python=3.5 anaconda

and then activate the new environment with

source activate py35

(check more details). After that, you can install TPOT in the new environment by following the installation manual and test TPOT 0.7 there. Please let me know whether that solves the issue for now.
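For reference, rerunning the Iris example with verbosity=3 could look roughly like the sketch below. It is loosely modelled on the Iris example in the TPOT documentation; the helper names debug_settings and run_iris_example, the random_state value, and the split sizes are assumptions made for illustration.

```python
def debug_settings(verbosity=3):
    """TPOTClassifier keyword arguments for a small, verbose debugging run."""
    return {
        "generations": 5,
        "population_size": 50,
        "verbosity": verbosity,  # 3 prints the most detail while fitting
        "random_state": 42,      # assumed seed, only to make runs comparable
    }


def run_iris_example(settings):
    """Fit TPOT on Iris and return the held-out score (needs tpot and scikit-learn)."""
    # Imported here so the settings helper can be used without TPOT installed.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from tpot import TPOTClassifier

    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data, iris.target, train_size=0.75, test_size=0.25, random_state=42
    )
    tpot = TPOTClassifier(**settings)
    tpot.fit(X_train, y_train)
    return tpot.score(X_test, y_test)


# Assumed usage:
#
#     print(run_iris_example(debug_settings()))
```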

YukiSakai1209 commented 7 years ago

Thank you very much for your information.

When I run the two examples with verbosity=3, they seem to stop after several steps.

Iris

DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
30 operators have been imported by TPOT.
_pre_test decorator: _generate: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.
_pre_test decorator: _generate: num_test=0 Unsupported set of arguments: The combination of penalty='l1' and loss='squared_hinge' are not supported when dual=True, Parameters: penalty='l1', loss='squared_hinge', dual=True
_pre_test decorator: _generate: num_test=0 max_features must be in (0, n_features]
_pre_test decorator: _generate: num_test=1 max_features must be in (0, n_features]
_pre_test decorator: _generate: num_test=0 Input X must be non-negative
_pre_test decorator: _generate: num_test=1 Input X must be non-negative
_pre_test decorator: _generate: num_test=0 Found array with 0 feature(s) (shape=(50, 0)) while a minimum of 1 is required.
_pre_test decorator: _generate: num_test=1 Unsupported set of arguments: The combination of penalty='l1' and loss='logistic_regression' are not supported when dual=True, Parameters: penalty='l1', loss='logistic_regression', dual=True
_pre_test decorator: _generate: num_test=0 max_features must be in (0, n_features]
_pre_test decorator: _generate: num_test=0 Input X must be non-negative
_pre_test decorator: _generate: num_test=1 precomputed was provided as affinity. Ward can only work with euclidean distances.
_pre_test decorator: _generate: num_test=2 Input X must be non-negative
_pre_test decorator: _generate: num_test=0 Unsupported set of arguments: The combination of penalty='l1' and loss='hinge' is not supported, Parameters: penalty='l1', loss='hinge', dual=True
_pre_test decorator: _generate: num_test=0 l1 was provided as affinity. Ward can only work with euclidean distances.
_pre_test decorator: _generate: num_test=0 Expected n_neighbors <= n_samples, but n_samples = 50, n_neighbors = 67
Optimization Progress: 0%| | 0/300 [00:00<?, ?pipeline/s]
Invalid pipeline encountered. Skipping its evaluation.
Invalid pipeline encountered. Skipping its evaluation.
Optimization Progress: 4%|▍ | 12/300 [00:20<28:12, 5.88s/pipeline]

MNIST

DeprecationWarning: This module was deprecated in version 0.18 in favor of the model_selection module into which all the refactored classes and functions are moved. Also note that the interface of the new CV iterators are different from that of this module. This module will be removed in 0.20.
"This module will be removed in 0.20.", DeprecationWarning)
30 operators have been imported by TPOT.
_pre_test decorator: _generate: num_test=0 max_features must be in (0, n_features]
_pre_test decorator: _generate: num_test=1 Input X must be non-negative
_pre_test decorator: _generate: num_test=0 Unsupported set of arguments: The combination of penalty='l1' and loss='squared_hinge' are not supported when dual=True, Parameters: penalty='l1', loss='squared_hinge', dual=True
_pre_test decorator: _generate: num_test=1 Input X must be non-negative
_pre_test decorator: _generate: num_test=0 Input X must be non-negative
_pre_test decorator: _generate: num_test=1 coef_ is only available when using a linear kernel
_pre_test decorator: _generate: num_test=0 Input X must be non-negative
_pre_test decorator: _generate: num_test=0 Unsupported set of arguments: The combination of penalty='l1' and loss='squared_hinge' are not supported when dual=True, Parameters: penalty='l1', loss='squared_hinge', dual=True
_pre_test decorator: _generate: num_test=0 Unsupported set of arguments: The combination of penalty='l1' and loss='squared_hinge' are not supported when dual=True, Parameters: penalty='l1', loss='squared_hinge', dual=True
_pre_test decorator: _generate: num_test=0 Unsupported set of arguments: The combination of penalty='l1' and loss='squared_hinge' are not supported when dual=True, Parameters: penalty='l1', loss='squared_hinge', dual=True
_pre_test decorator: _generate: num_test=0 precomputed was provided as affinity. Ward can only work with euclidean distances.
Optimization Progress: 0%| | 0/300 [00:00<?, ?pipeline/s]
_pre_test decorator: _generate: num_test=0 Unsupported set of arguments: The combination of penalty='l2' and loss='hinge' are not supported when dual=False, Parameters: penalty='l2', loss='hinge', dual=False
_pre_test decorator: _generate: num_test=0 Input X must be non-negative
Invalid pipeline encountered. Skipping its evaluation.
Invalid pipeline encountered. Skipping its evaluation.
Optimization Progress: 5%|▌ | 16/300 [01:01<13:48, 2.92s/pipeline]

However, after I created a fresh Python 3.5 environment, the problem disappeared completely. I'm not sure why, and I'm sorry to have troubled you. I had uninstalled and reinstalled TPOT several times, but I had not recreated the Python environment itself. I should have tried that myself.

Of course, several new features work very well.

Thank you very much for your kind help.

weixuanfu commented 7 years ago

Great, good to know the issue is solved. It is my pleasure. Thank you for sharing the stdout here.

rhiever commented 7 years ago

Please re-open this issue and let us know if you find out the Python environment setting that seemed to cause TPOT 0.7 to freeze (or stop working). If it is a reproducible issue, then we will endeavor to patch it ASAP.

amir-abdi commented 7 years ago

I have the same problem with the new version of TPOT.

weixuanfu commented 7 years ago

@amir-abdi Could you please provide more details about the problem, such as code that reproduces the issue? Alternatively, could you try the development branch? It may give you an error message. The command for installing the development branch is:

pip install --upgrade --no-deps --force-reinstall git+https://github.com/rhiever/tpot.git@development

amir-abdi commented 7 years ago

@weixuanfu The code is as simple as:

from tpot import TPOTClassifier

pipeline_optimizer = TPOTClassifier(generations=5,
                                    population_size=20,
                                    cv=5,
                                    random_state=42,
                                    verbosity=2)

weixuanfu commented 7 years ago

Could you please let me know more details about your case? For example, what is your system environment (OS, Python version, TPOT version)? Also, did you reproduce the issue with the Iris or MNIST example in the link?
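A small script can gather the OS, Python version, and package versions in one go. This is a minimal sketch: the function name environment_report and the default package list are assumptions (the list is a guess at TPOT's usual dependencies), and importlib.metadata requires Python 3.8 or newer, so it would not run on the Python 3.5 environments discussed above.

```python
import platform
from importlib import metadata

# Assumed list of packages worth reporting; adjust to your installation.
DEFAULT_PACKAGES = ("tpot", "scikit-learn", "numpy", "scipy", "deap", "tqdm", "update_checker")


def environment_report(packages=DEFAULT_PACKAGES):
    """Return a dict with the OS, Python version, and installed package versions."""
    report = {
        "os": platform.platform(),
        "python": platform.python_version(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


# Assumed usage: print one "name: value" line per entry to paste into a report.
#
#     for key, value in environment_report().items():
#         print(f"{key}: {value}")
```

Running it in both the environment that stalls and the one that works makes it easier to spot a version difference that might explain the behaviour.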