Dlux804 / McQuade-Chem-ML

Development of easy to use and reproducible ML scripts for chemistry.

Smart Hyperparameter Optimization #17

Closed Dlux804 closed 4 years ago

Dlux804 commented 4 years ago

Is your feature request related to a problem? Please describe. Currently, we only have random or brute-force ways to sample the hyperparameter space. This is extremely inefficient and does not give much confidence in the resulting tuning.

Describe the solution you'd like There are methods for tuning hyperparameters that take previous samples into account. Bayesian optimization is one such approach. We should implement it using a pre-written function rather than writing our own.
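To make the idea concrete, here is a minimal sketch of the Bayesian-optimization loop described above, using a Gaussian-process surrogate and an upper-confidence-bound acquisition rule. The objective function is a hypothetical stand-in for a cross-validated model score; the kernel, candidate grid, and UCB constant are illustrative assumptions, not part of the issue.

```python
# Minimal Bayesian-optimization sketch over one hyperparameter.
# `objective` is a stand-in for a cross-validation score; in practice
# it would train and score a model at hyperparameter value x.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):
    # hypothetical score surface, peaked at x = 0.3
    return -(x - 0.3) ** 2

rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 1.0, 101).reshape(-1, 1)

# seed the surrogate with a few random evaluations
X = rng.uniform(0, 1, size=(3, 1))
y = np.array([objective(x[0]) for x in X])

for _ in range(10):
    # fit a Gaussian-process surrogate to all points tried so far
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    mean, std = gp.predict(candidates, return_std=True)
    # upper confidence bound: prefer high predicted score + high uncertainty
    ucb = mean + 1.96 * std
    x_next = candidates[int(np.argmax(ucb))]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

best = X[int(np.argmax(y))][0]
```

Unlike random or grid search, each new point here is chosen using everything already observed, which is the efficiency gain the feature request is after.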

Describe alternatives you've considered Storing all the random hyperparameter combinations we have tried for each model and excluding them from future searches (the same applies to grid search). That way we never re-compute any point, we continually log our coverage of the hyperparameter space, and we can simply sort the log by score to find the best combination.
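The logging alternative above can be sketched in a few lines: keep a dict of every combination already scored, skip repeats during sampling, and sort by score at the end. The parameter names and the scoring function are hypothetical placeholders.

```python
# Sketch of the "log everything, never re-compute" alternative.
# `fake_score` stands in for a real cross-validation score.
import random

tried = {}  # maps hyperparameter tuple -> score

def sample_params(rng):
    # illustrative search space (e.g. n_estimators, max_depth)
    return (rng.choice([10, 50, 100]), rng.choice([None, 5, 10]))

def fake_score(params):
    n_estimators, max_depth = params
    return n_estimators / 100 - (0 if max_depth else 0.1)

rng = random.Random(0)
while len(tried) < 5:
    params = sample_params(rng)
    if params in tried:
        continue  # never re-compute an already-sampled point
    tried[params] = fake_score(params)

# sort the full log by score to find the best combination so far
best_params, best_score = max(tried.items(), key=lambda kv: kv[1])
```

This avoids wasted evaluations but, unlike Bayesian optimization, it does not use past scores to decide where to sample next.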

Additional context Github repo with some working code. And another one. This one will also optimize which model type is best for the task.
This one also seems promising.

Dlux804 commented 4 years ago

I am using skopt's BayesSearchCV(). It has a compatibility issue with newer versions of scikit-learn that has been fixed by the community; see this Github repo. It can be installed with:

pip install git+https://github.com/darenr/scikit-optimize