hyperopt / hyperopt-sklearn

Hyper-parameter optimization for sklearn
hyperopt.github.io/hyperopt-sklearn
Other
1.57k stars 270 forks source link

Use numpy.random's Generator instead of legacy RandomState #177

Closed rharish101 closed 2 years ago

rharish101 commented 2 years ago

This adapts hyperopt-sklearn to the change in hyperopt#821, to fix hyperopt#829.

Note: Just like the original PR, this will lead to an API breakage, since numpy.random.Generator is not backwards-compatible with numpy.random.RandomState.

mandjevant commented 2 years ago

Thank you for your PR. I'm interested in implementing these changes but they are not suitable since the project has changed immensely. Would you mind re-doing these changes on the rewritten version that is now on the master branch?

Thanks a lot for supporting this project!

rharish101 commented 2 years ago

Sorry for the late reply, but sure, I'll try to do it.

rharish101 commented 2 years ago

@mandjevant I've re-done these changes onto the latest master. Could you review and run tests?

mandjevant commented 2 years ago

Thank you for your quick update @rharish101. I approved the workflow but it errored.

ValueError: Generator(PCG64) at 0x7F280A7ED040 cannot be used to seed a numpy.random.RandomState instance Let me know if the error makes sense or if you could use help fixing it :) Thanks again!

rharish101 commented 2 years ago

I've pushed the fixes. The error was due to scikit-learn only supporting the legacy RNG. I've added a workaround to give it a legacy RNG derived from hyperopt-sklearn's modern RNG.

Could you run the workflow again?

mandjevant commented 2 years ago

Thank you for the quick fix @rharish101 I find that the current test workflow is insufficient for usability. Therefore I'd like to test the usability this weekend locally.

I'll respond with my findings within 48 hours.

rharish101 commented 2 years ago

Sure, take your time 👍

mandjevant commented 2 years ago

Alright, I've tested all possible versions and confusions of the legacy generator and the new np generator.

And I think that the combination of

  1. passing a seed to the estimator instead of a Generator,
  2. having appropriate type hinting,
  3. explicit documentation on this by NumPy, ensure the usability of the module. So I am ready to merge this.

Thank you for your pull request @rharish101, I'm very excited that we are now able to use the latest (and greatest) Hyperopt version!

mandjevant commented 2 years ago

@bjkomer Would you like to review before I merge?

rharish101 commented 2 years ago

@bjkomer Looking at the type annotations, I thought that fit was just supposed to take a RandomState object. That's why I changed it to Generator. Do you instead want to keep backwards compatibility with RandomState while supporting both integers and Generator?

mandjevant commented 2 years ago

Great catch @bjkomer, glad you found this!

Tests are not approving due to an unrelated issue.

187 should fix this.

mandjevant commented 2 years ago

@rharish101 could you pull the latest commit so I can re-run the tests? Thanks in advance!

rharish101 commented 2 years ago

@mandjevant I just rebased my branch onto master. Feel free to go ahead with the tests.

mandjevant commented 2 years ago

Thank you for your contribution @rharish101!