ray-project / tune-sklearn

A drop-in replacement for Scikit-Learn’s GridSearchCV / RandomizedSearchCV -- but with cutting-edge hyperparameter tuning techniques.
https://docs.ray.io/en/master/tune/api_docs/sklearn.html
Apache License 2.0

Allow any searcher, fixes #198

Closed: Yard1 closed this 3 years ago

Yard1 commented 3 years ago

This PR allows an arbitrary Tune Searcher to be used with TuneSearchCV, fixing #198.

There are no API-breaking changes and CI should pass.

The reason why a Searcher subclass is passed instead of an object is that I can't see a good way of handling the various parameters inside one. For example, someone may pass an object with _metric and _mode already set, which would cause issues if those keys were also passed to tune.run - but it's also not possible to simply check for those attributes, as there's no requirement that they be present. Forcing the user to pass a class instead (while still allowing them to instantiate it with custom kwargs) seemed like the best solution.

richardliaw commented 3 years ago

Sorry for the slow reply!

The reason why a Searcher subclass is passed instead of an object is that I can't see a good way of handling the various parameters inside one. For example, someone may pass an object with _metric and _mode already set, which would cause issues if those keys were also passed to tune.run

If I had a custom Searcher, how would I pass it in to TuneSearchCV with custom arguments?

I actually think we could just check if object.metric / object.mode is set, and fail if they are?
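
In code, that check might look roughly like this (a minimal sketch; it assumes the searcher stores these values as _metric/_mode attributes, which is exactly the assumption Yard1 flags as unreliable):

def _fail_on_preset_keys(searcher):
    # Hypothetical helper illustrating the suggested check: reject searcher
    # instances that already carry metric/mode, since TuneSearchCV would
    # also pass those keys to tune.run.
    if getattr(searcher, "_metric", None) is not None or \
            getattr(searcher, "_mode", None) is not None:
        raise ValueError(
            "Set metric/mode through TuneSearchCV, not on the Searcher.")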

Yard1 commented 3 years ago

@richardliaw You'd pass custom Searcher args as kwargs in TuneSearchCV. However, now that we've talked it through, it will probably be much cleaner to pass Searcher objects instead and enforce those attributes on them. I'll change it after the holidays here :)

Yard1 commented 3 years ago

I had a look at it and using objects is not as straightforward.

The issue is that if a user passes an algo-specific search space, it can't be passed directly to tune.run (like the ConfigSpace object for BOHB). The handling for that usually lives in the Searcher's __init__, which also sets attributes like metric_op that vary from algo to algo and cannot all be accounted for here. It is thus not possible to simply change the _metric and _mode attributes and pass the modified object to tune.run.
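
For concreteness, this is the kind of algo-specific space in question - standard ConfigSpace/TuneBOHB usage rather than code from this PR:

import ConfigSpace as CS
from ray.tune.suggest.bohb import TuneBOHB

# An algo-specific search space: a ConfigSpace object rather than a dict
# of Tune search spaces, so tune.run cannot consume it directly.
config_space = CS.ConfigurationSpace()
config_space.add_hyperparameter(
    CS.UniformFloatHyperparameter("alpha", lower=1e-4, upper=1e-1))

# The object is handled inside TuneBOHB's __init__, which also derives
# internal attributes from metric/mode at construction time.
searcher = TuneBOHB(config_space, metric="loss", mode="min")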

Therefore, if we want to allow users to pass Searcher instances (instead of classes, as I first proposed), we would need to force them to use only Tune search spaces.

What are your thoughts on solving this? I agree that passing classes instead of instances is not intuitive, but I'm not sure what solution would avoid blocking algo-specific search space objects.

This is how passing a Searcher class with arguments would look:

search = TuneSearchCV(search_optimization=MySearcher, searcher_argument_1=True)  # extra kwargs are forwarded to search_optimization's __init__
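
Under this proposal, TuneSearchCV could consume the class roughly like so (a hypothetical sketch; _instantiate_searcher is an invented name, not code from this PR):

import inspect

from ray.tune.suggest import Searcher

def _instantiate_searcher(search_optimization, metric, mode, **searcher_kwargs):
    # TuneSearchCV controls metric/mode, while the user controls everything
    # else via the kwargs forwarded to the class's __init__.
    if inspect.isclass(search_optimization) and issubclass(
            search_optimization, Searcher):
        return search_optimization(metric=metric, mode=mode, **searcher_kwargs)
    raise TypeError("search_optimization must be a Searcher subclass")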

CC @krfricke @richardliaw

richardliaw commented 3 years ago

@Yard1 could you help me understand the following? Under the object searcher proposal:

  1. Will users be able to provide a string argument for the searcher?
  2. If using a custom searcher object, why would the user not be able to just pass the custom config space via the searcher object?
  3. What are the breaking changes to the user experience made by the object searcher proposal?

Yard1 commented 3 years ago

@richardliaw

  1. Yes, that is unchanged in either case (no API breaking changes)
  2. Yes, but they will also need to initialize the object with metric and mode alongside the space. All those attributes would be detached from the space, metric and mode arguments in TuneSearchCV, as it is not possible to set them afterwards in a way that covers all possible cases. There are a lot of algo-specific interactions and derived attributes that are set on initialization - for example, many Searchers set the metric_op attribute during initialization depending on the _mode attribute (see the toy example after this list). We could technically special-case every Searcher class in Ray Tune, but it would be a nightmare to maintain and would not account for the possibility of a user writing their own Searcher.
  3. As described in 2, it would completely detach the parameters set in TuneSearchCV from what's set on the Searcher object. I can imagine someone forgetting to initialize a new Searcher instance with a different metric and changing it only in TuneSearchCV. For the same reason, ensuring consistency just by raising exceptions on mismatch will not be fully robust either (though it is more doable than setting the attributes ourselves, as derived attributes would not need to be accounted for).
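
To see why mutating the attributes after the fact is not enough, consider a toy searcher (a simplified illustration, not actual Tune code):

class ToySearcher:
    def __init__(self, metric=None, mode=None):
        self._metric = metric
        self._mode = mode
        # Derived attribute, computed once at init time from the mode.
        self._metric_op = 1.0 if mode == "max" else -1.0

searcher = ToySearcher(metric="loss", mode="min")
searcher._mode = "max"  # flipping the attribute afterwards...
assert searcher._metric_op == -1.0  # ...leaves the derived value stale
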
richardliaw commented 3 years ago

One use case I'd like to support is the search algorithm restoration. This allows people to use learnings from past optimization jobs to speed up new ones.

This would require access to the search algorithm object (Searcher.restore()).

Ideally, we would have the following:

# 1. Base usage if using a custom searcher
searcher = Searcher()
TuneSearchCV(searcher, config=config, metric=X, mode=Y)

# 2. RAISES ERROR if metric/mode are set in searcher
searcher = Searcher(metric=X, mode=Y)
TuneSearchCV(searcher, metric=X, mode=Y)

# 3. custom search space, OK
searcher = Searcher(config=config)  # is this possible?
TuneSearchCV(searcher, metric=X, mode=Y)

# 4. RAISES ERROR if config is set in both places.
searcher = Searcher(config=config) 
TuneSearchCV(config=config, searcher=searcher, metric=X, mode=Y)

# 5. Restore from previous run
searcher = Searcher()
searcher.restore("path")
TuneSearchCV(config=config, searcher=searcher, metric=X, mode=Y)

My understanding is that 3 is the controversial one? Though a quick skim of the code tells me that you can actually pass a space without setting metric/mode for many search algs.

Yard1 commented 3 years ago

So there are two ways of configuring searchers: either through set_search_properties, which is called by tune.run, or by specifying the necessary parameters in __init__. The issue is that for all the Tune searchers, if the instance is initialized with the space, then set_search_properties will fail and tune.run will raise an exception. You can see this behavior in e.g. the Optuna searcher:

class OptunaSearch(Searcher):
    def __init__(self,
                 space: Optional[Union[Dict, List[Tuple]]] = None,
                 metric: Optional[str] = None,
                 mode: Optional[str] = None,
                 points_to_evaluate: Optional[List[Dict]] = None,
                 sampler: Optional[BaseSampler] = None):
        # ...

        if isinstance(space, dict) and space:
            resolved_vars, domain_vars, grid_vars = parse_spec_vars(space)
            if domain_vars or grid_vars:
                logger.warning(
                    UNRESOLVED_SEARCH_SPACE.format(
                        par="space", cls=type(self)))
                space = self.convert_search_space(space)

        self._space = space
        # ...
        if self._space:
            self._setup_study(mode)

    def set_search_properties(self, metric: Optional[str], mode: Optional[str],
                              config: Dict) -> bool:
        if self._space:
            return False
        space = self.convert_search_space(config)
        self._space = space
        if metric:
            self._metric = metric
        if mode:
            self._mode = mode

        self._setup_study(mode)
        return True

If the space is set in __init__, then it is not possible to set metric and mode through set_search_properties (it returns False, which causes tune.run to raise an exception). All the other Searchers have the same behavior. Therefore, there are only two ways to configure them:

  1. If space, metric and mode are to be set through set_search_properties called by tune.run, then the Searcher has to be initialized without specifying the space, so that set_search_properties can set all three,
  2. Alternatively, all three attributes must be set in __init__.

It is not possible to e.g. set just the space in __init__ and then have metric and mode set by tune.run, as that will raise an exception. On the other hand, tune.run only supports dicts as search spaces, so it is not possible to pass algo-specific search space objects to it, leaving only the second option available.

I see two solutions:

  1. Force everything to go through tune.run - this is the simplest, most straightforward and easiest-to-maintain option, but it makes it impossible to e.g. pass a ConfigSpace object directly to BOHB, forcing the user to recreate it as a dict of Tune search spaces
  2. If the searcher has been initialized with a space already, then raise exceptions if the metric and mode in TuneSearchCV differ - this is not 100% robust, but should work fine for most cases.

I hope this clears it up, sorry for the confusion. The gist is that space, metric and mode can only be set either all at once in a Searcher's __init__, or not at all (leaving all three to tune.run).
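
In code, the two valid configurations look roughly like this (a sketch against the Ray Tune API of that era; note that passing a Tune-style dict to OptunaSearch's space triggers the conversion warning shown in the snippet above):

from ray import tune
from ray.tune.suggest.optuna import OptunaSearch

def trainable(config):
    # Minimal trainable that reports the metric being optimized.
    tune.report(loss=config["alpha"])

tune_space = {"alpha": tune.loguniform(1e-4, 1e-1)}

# (a) space, metric and mode all set in __init__; tune.run must not try
# to set them again (set_search_properties would return False and
# tune.run would raise).
searcher = OptunaSearch(space=tune_space, metric="loss", mode="min")
tune.run(trainable, search_alg=searcher, num_samples=4)

# (b) none of them set in __init__; tune.run sets all three through
# set_search_properties.
searcher = OptunaSearch()
tune.run(trainable, config=tune_space, metric="loss", mode="min",
         search_alg=searcher, num_samples=4)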

richardliaw commented 3 years ago

Hmm ok thanks for the clarification!

What about we go with:

If the searcher has been initialized with a space already, then raise exceptions if the metric and mode in TuneSearchCV differ - this is not 100% robust, but should work fine for most cases.

But also make it so that we don't actually need to specify it twice (i.e., you can leave the one in TuneSearchCV empty)?

Yard1 commented 3 years ago

So the properties defined in the Searcher should overwrite the ones in TuneSearchCV? The ones in TuneSearchCV are also used for early stopping (Schedulers).

richardliaw commented 3 years ago

So the properties defined in the Searcher should overwrite the ones in TuneSearchCV? The ones in TuneSearchCV are also used for early stopping (Schedulers).

Ah, no. OK, to make it simple, how about you just make sure everything has to have the same values for now, and we can think about overriding later?

TuneSearchCV mode/metric has to be set, and the Scheduler/Searcher ones will also have to match it if passed in.

Yard1 commented 3 years ago

Sure, let me get on that

Yard1 commented 3 years ago

It's now using instances while enforcing consistency between TuneSearchCV and the Searcher.
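
Concretely, the enforcement amounts to something like the following (a hedged sketch of the rule agreed on above; _check_searcher_consistency is an invented name, not necessarily the code that landed):

def _check_searcher_consistency(searcher, cv_metric, cv_mode):
    # If the Searcher instance already carries a metric/mode, it must
    # match what was set on TuneSearchCV.
    for attr, cv_value in (("_metric", cv_metric), ("_mode", cv_mode)):
        searcher_value = getattr(searcher, attr, None)
        if searcher_value is not None and searcher_value != cv_value:
            raise ValueError(
                "Searcher {} ({}) does not match TuneSearchCV ({}).".format(
                    attr, searcher_value, cv_value))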

richardliaw commented 3 years ago

@Yard1 could you look into the test and see if this is related?

Yard1 commented 3 years ago

@richardliaw CI is failing due to Optuna - I believe the version is wrong, I'll check - but it shouldn't be related to this.

Yard1 commented 3 years ago

@richardliaw Mode is now configurable and release CI passes (it was installing a version of Optuna that was too old)

richardliaw commented 3 years ago

Great work @Yard1 !