ray-project / ray

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
https://ray.io
Apache License 2.0

[tune] Save/restore searcher state. #8783

Closed drivkin closed 3 years ago

drivkin commented 4 years ago

I am using Tune and Nevergrad to perform black-box optimization with PSO (particle swarm optimization). To do this, I create a NevergradSearch instance and pass it to the tune.run function via the search_alg keyword. This is where all of the state related to the PSO search is contained. The run_or_experiment arg I pass is just a function that evaluates my black box.
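For reference, here is a minimal sketch of the setup described above. Exact signatures vary across Ray and Nevergrad versions, and the objective function, search space, and names like `evaluate_black_box` are illustrative, not from the original report.

```python
# Minimal sketch: NevergradSearch (PSO) passed to tune.run via search_alg.
# API details may differ between Ray/Nevergrad versions.
import nevergrad as ng
from ray import tune
from ray.tune.suggest.nevergrad import NevergradSearch


def evaluate_black_box(config):
    # Evaluate the black box at the point suggested by the searcher
    # and report the objective back to Tune. (Toy objective for illustration.)
    score = (config["x"] - 1.0) ** 2 + (config["y"] + 2.0) ** 2
    tune.report(loss=score)


# All PSO-related state lives inside this searcher object.
searcher = NevergradSearch(
    optimizer=ng.optimizers.PSO,
    metric="loss",
    mode="min",
)

analysis = tune.run(
    evaluate_black_box,
    config={"x": tune.uniform(-5, 5), "y": tune.uniform(-5, 5)},
    search_alg=searcher,
    num_samples=50,
)
```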

I would like to have the option to stop a search and resume it later. This essentially just requires saving the state of the search algorithm, which, for NevergradSearch, is already implemented here. However, to the best of my knowledge, there is currently no way to get Tune to "checkpoint and restore" this state the way it does for the state of run_or_experiment when run_or_experiment is a subclass of Trainable.

I imagine I could achieve this kind of functionality by running tune.run repeatedly and saving the searcher state between runs (roughly as in the sketch below), but that seems clunky. I propose saving/restoring the state of the search_alg the same way the state of run_or_experiment is saved/restored.
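A rough sketch of that clunky workaround, assuming the searcher exposes the `save`/`restore` methods of Tune's Searcher interface (availability and behavior may differ across Ray versions; the checkpoint path and the `searcher`/`evaluate_black_box` names are carried over from the sketch above):

```python
# Workaround sketch: run Tune in chunks and persist the searcher state in between.
import os

CHECKPOINT = "/tmp/nevergrad_searcher.pkl"  # illustrative path

if os.path.exists(CHECKPOINT):
    searcher.restore(CHECKPOINT)  # resume PSO state from a previous chunk

tune.run(
    evaluate_black_box,
    config={"x": tune.uniform(-5, 5), "y": tune.uniform(-5, 5)},
    search_alg=searcher,
    num_samples=10,  # one chunk of the overall sample budget
)

searcher.save(CHECKPOINT)  # persist PSO state for the next chunk
```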

richardliaw commented 4 years ago

@drivkin thanks for making this issue. This is definitely a top requested feature and is on our todo list. We'll get to this in the next couple weeks.

manuels commented 3 years ago

Any updates on this issue?

richardliaw commented 3 years ago

I think this is actually fixed; most, if not all, searchers now have their state saved during execution.
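If that is the case, a sketch of how it would be used looks roughly like the following; this assumes searcher state is restored together with the experiment checkpoint when resuming, which depends on the Ray version in use.

```python
# Sketch: resume an existing experiment; with experiment checkpointing,
# the searcher state should be restored as well (version-dependent assumption).
tune.run(
    evaluate_black_box,
    name="pso_search",          # fixed experiment name so the run can be found again
    local_dir="~/ray_results",  # default results directory
    search_alg=searcher,
    num_samples=50,
    resume=True,                # restore trial and searcher state if a checkpoint exists
)
```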

Nithanaroy commented 3 years ago

@drivkin what version of ray and PSO did you use? I'm facing some compatibility issues with these libs. Thank you for the help.