Closed: Yard1 closed this 3 years ago
Sorry for the slow reply!
The reason why a Searcher subclass is passed instead of an object is that I can't see a good way of handling various parameters inside it. For example, someone may pass an object with `_metric` and `_mode` already set, which would cause issues if those keys were also passed to `tune.run`.
If I had a custom Searcher, how would I pass it in to TuneSearchCV with custom arguments?
I actually think we could just check if `object.metric` / `object.mode` is set, and fail if they are?
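A minimal sketch of what that check could look like (the helper and the `FakeSearcher` stand-in are illustrative, not the actual tune-sklearn or Ray Tune code):

```python
# Hypothetical guard, sketched for illustration: reject Searcher
# objects that already carry _metric/_mode, since TuneSearchCV
# would set those itself before calling tune.run.
def validate_searcher(searcher):
    for attr in ("_metric", "_mode"):
        # getattr with a default covers Searchers that never define
        # the attribute at all (it is not guaranteed to exist)
        if getattr(searcher, attr, None) is not None:
            raise ValueError(
                f"Searcher already has {attr!r} set; pass metric/mode "
                f"to TuneSearchCV instead")
    return searcher


class FakeSearcher:
    """Minimal stand-in for a Ray Tune Searcher."""

    def __init__(self, metric=None, mode=None):
        self._metric = metric
        self._mode = mode


validate_searcher(FakeSearcher())  # passes: nothing pre-set
```

The catch, raised later in the thread, is that these attributes are not guaranteed to exist on a custom Searcher, hence the `getattr` default.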
@richardliaw You'd pass custom Searcher args as kwargs in TuneSearchCV. However, now that we've talked it through, it will probably be much cleaner to pass Searcher objects instead and enforce those attributes on them. I'll change it after the holidays here :)
I had a look at it and using objects is not as straightforward.
The issue is that if a user passes an algo-specific search space, it can't be passed directly to `tune.run` (like the `ConfigSpace` object for BOHB). The handling for that is usually present in the Searcher's `__init__`, which also sets attributes like `metric_op` that can vary from algo to algo and cannot all be accounted for here. It is thus not possible to simply change the `_metric` and `_mode` attributes and pass the changed object to `tune.run`.
Therefore, if we want to allow users to pass Searcher instances, instead of classes as I first proposed, we would need to force the user to use only Tune search spaces with them.
What are your thoughts on solving this? I agree that passing classes instead of instances is not intuitive, but I'm not sure what solution would avoid dropping support for algo-specific search space objects.
This is how passing a Searcher class with arguments would look:

```python
# kwargs are passed to search_optimization's __init__
search = TuneSearchCV(search_optimization=MySearcher, searcher_argument_1=True)
```
CC @krfricke @richardliaw
@Yard1 could you help me understand the following questions? Under the object searcher proposal:
@richardliaw Searchers may set e.g. the `metric_op` attribute during initialization depending on the `_mode` attribute. We could technically make cases for all Searcher classes in Ray Tune, but it would be a nightmare to maintain and would not account for the possibility of a user making their own Searcher.

One use case I'd like to support is search algorithm restoration. This allows people to use learnings from past optimization jobs to speed up new ones. This would require access to the search algorithm object (`Searcher.restore()`).
Ideally, we would have the following:

```python
# 1. Base usage if using a custom searcher
searcher = Searcher()
TuneSearchCV(searcher, config=config, metric=X, mode=Y)

# 2. RAISES ERROR if metric/mode are set in the searcher
searcher = Searcher(metric=X, mode=Y)
TuneSearchCV(searcher, metric=X, mode=Y)

# 3. Custom search space, OK
searcher = Searcher(config=config)  # is this possible?
TuneSearchCV(searcher, metric=X, mode=Y)

# 4. RAISES ERROR if config is set in both places
searcher = Searcher(config=config)
TuneSearchCV(config=config, searcher=searcher, metric=X, mode=Y)

# 5. Restore from previous run
searcher = Searcher()
searcher.restore("path")
TuneSearchCV(config=config, searcher=searcher, metric=X, mode=Y)
```
My understanding is that 3 is the controversial one? Though actually a quick skim of code tells me that you can actually pass space without setting metric/mode for many search algs.
So there are two ways of configuring searchers: either through `set_search_properties` called by `tune.run`, or by specifying the necessary parameters in `__init__`. The issue is that for all the Tune searchers, if the instance is initialized with the space, then `set_search_properties` will fail and `tune.run` will raise an exception. You can see this behavior in e.g. the Optuna searcher:
```python
class OptunaSearch(Searcher):
    def __init__(self,
                 space: Optional[Union[Dict, List[Tuple]]] = None,
                 metric: Optional[str] = None,
                 mode: Optional[str] = None,
                 points_to_evaluate: Optional[List[Dict]] = None,
                 sampler: Optional[BaseSampler] = None):
        # ...
        if isinstance(space, dict) and space:
            resolved_vars, domain_vars, grid_vars = parse_spec_vars(space)
            if domain_vars or grid_vars:
                logger.warning(
                    UNRESOLVED_SEARCH_SPACE.format(
                        par="space", cls=type(self)))
                space = self.convert_search_space(space)
        self._space = space
        # ...
        if self._space:
            self._setup_study(mode)

    def set_search_properties(self, metric: Optional[str], mode: Optional[str],
                              config: Dict) -> bool:
        if self._space:
            return False
        space = self.convert_search_space(config)
        self._space = space
        if metric:
            self._metric = metric
        if mode:
            self._mode = mode
        self._setup_study(mode)
        return True
```
If the space is set in `__init__`, then it is not possible to set metric and mode through `set_search_properties` (it returns `False`, which causes `tune.run` to raise an exception). All the other Searchers have the same behavior. Therefore, there are only two ways to configure them:

1. Through `set_search_properties`, called by `tune.run` - the Searcher then has to be initialized without specifying the space, so that `set_search_properties` can set all three.
2. By setting space, metric and mode all at once in `__init__`.

It is not possible to e.g. set just the space in `__init__` and then metric and mode by `tune.run`, as that will raise an exception. On the other hand, `tune.run` only supports dicts as search spaces, so it is not possible to pass algo-specific search space objects to it, leaving only the second choice available.
I see two solutions:

1. Allow only Tune search spaces (dicts) that can be passed to `tune.run` - this will be the simplest, most straightforward and easiest to maintain option, but it will make it impossible to e.g. pass a ConfigSpace object directly to BOHB, forcing the user to recreate it as a dict of Tune search spaces.
2. Require the user to fully configure the Searcher (space, metric and mode) in `__init__` and pass the instance - this keeps algo-specific search spaces working, at the cost of duplicating some of TuneSearchCV's own arguments.

I hope this clears it up, sorry for the confusion. The gist is that it's only possible to either set space, metric and mode all at once in `__init__` of a Searcher, or none at all.
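To make that contract concrete, here is a toy model of the behavior described above (this is not Ray Tune's real class, just an illustration of the two valid configuration paths):

```python
# Toy model of the Searcher contract: either tune.run supplies
# space, metric and mode via set_search_properties, or __init__
# receives all of them up front and later calls are refused.
class ToySearcher:
    def __init__(self, space=None, metric=None, mode=None):
        self._space = space
        self._metric = metric
        self._mode = mode

    def set_search_properties(self, metric, mode, config):
        if self._space:  # space fixed in __init__ -> refuse
            return False
        self._space = config
        self._metric = metric
        self._mode = mode
        return True


# Path 1: nothing in __init__; tune.run configures everything.
s1 = ToySearcher()
assert s1.set_search_properties("loss", "min", {"lr": [1e-3, 1e-1]})

# Path 2: everything in __init__; a later call is rejected, which
# is what makes tune.run raise for half-configured searchers.
s2 = ToySearcher(space={"lr": [1e-3, 1e-1]}, metric="loss", mode="min")
assert not s2.set_search_properties("loss", "min", {})
```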
Hmm ok thanks for the clarification!
What about we go with:
If the searcher has been initialized with a space already, then raise exceptions if metric and mode in TuneSearchCV are different - this is not 100% robust, but should work fine for most cases.
But also make it so that we don't actually need to specify it twice (i.e., you can leave the one in TuneSearchCV empty?)
So the properties defined in the Searcher should overwrite the ones in TuneSearchCV? The ones in TuneSearchCV are also used for early stopping (Schedulers).
Ah, no. Ok to make it simple, how about you just make sure everyone has to have the same for now, and we can think about overriding later?
TuneSearchCV mode/metric has to be set, and Scheduler/Searchers will also have to be set if passed in.
Sure, let me get on that
It's now using instances while enforcing consistency between TuneSearchCV and the Searcher.
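A sketch of what such a consistency check could look like (the helper name and `StubSearcher` are made up for illustration; this is not the actual tune-sklearn code):

```python
# Illustrative version of the agreed rule: TuneSearchCV's metric and
# mode are mandatory, and a Searcher that already has them set must
# agree with TuneSearchCV exactly.
def check_consistency(searcher, metric, mode):
    if metric is None or mode is None:
        raise ValueError("TuneSearchCV requires both metric and mode")
    for attr, expected in (("_metric", metric), ("_mode", mode)):
        actual = getattr(searcher, attr, None)
        if actual is not None and actual != expected:
            raise ValueError(
                f"Searcher {attr}={actual!r} conflicts with "
                f"TuneSearchCV value {expected!r}")


class StubSearcher:
    """Minimal stand-in for a Ray Tune Searcher."""

    def __init__(self, metric=None, mode=None):
        self._metric = metric
        self._mode = mode


check_consistency(StubSearcher(), "accuracy", "max")                   # OK
check_consistency(StubSearcher("accuracy", "max"), "accuracy", "max")  # OK
```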
@Yard1 could you look into the test and see if this is related?
@richardliaw CI is failing due to Optuna - I believe the version is wrong, I'll check - but it shouldn't be related to this.
@richardliaw Mode is now configurable and release CI passes (it was installing a version of Optuna that was too old)
Great work @Yard1 !
This PR:

- fixes an issue where the `refit` parameter was not being enforced as intended
- passes `metric` and `mode` from Searchers/Schedulers to `tune.run` where possible

There are no API breaking changes and CI should pass.
The reason why a Searcher subclass is passed instead of an object is that I can't see a good way of handling various parameters inside it. For example, someone may pass an object with `_metric` and `_mode` already set, which would cause issues if those keys were passed to `tune.run` - but it's also not possible to just check for those attributes, as there's no requirement of them being there. Forcing the user to pass a class instead (while still allowing them to instantiate it with custom kwargs) seemed like the best solution.