ray-project / tune-sklearn

A drop-in replacement for Scikit-Learn’s GridSearchCV / RandomizedSearchCV -- but with cutting edge hyperparameter tuning techniques.
https://docs.ray.io/en/master/tune/api_docs/sklearn.html
Apache License 2.0

[Bug] HyperOpt optimization yields same results for both positive and negative loss values. #183

Closed Filco306 closed 3 years ago

Filco306 commented 3 years ago

Hello,

First of all, thank you for a nice repository. I am using your package for some experiments of mine, specifically TuneSearchCV in one of them.

I have built a custom sklearn estimator with its custom scoring function.
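To give a rough idea of its shape (a simplified, hypothetical sketch rather than my actual code; the class name, parameters, and loss below are placeholders), it follows the usual scikit-learn estimator pattern:

```python
import numpy as np
from sklearn.base import BaseEstimator

class Optimizer(BaseEstimator):
    """Hypothetical stand-in for the custom estimator; all names are placeholders."""

    def __init__(self, param_1=1, param_2=2):
        self.param_1 = param_1
        self.param_2 = param_2

    def fit(self, X, y=None):
        # fit whatever internal state the estimator needs (omitted here)
        return self

    def score(self, X, y=None):
        # scikit-learn's convention is that score() is "higher is better",
        # so a loss (lower is better) has to be negated before returning it
        return -self._loss(X)

    def _loss(self, X):
        # placeholder loss where lower is better; the real loss is problem-specific
        return float(np.mean(np.square(X)))
```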

My code:

```python
from ray import tune
from sklearn.model_selection import PredefinedSplit
from tune_sklearn import TuneSearchCV

# Optimizer is my custom estimator; X and seed are defined elsewhere.
param_dists = {
    "param_1": tune.choice([1, 4, 2, 3]),
    "param_2": tune.randint(2, 20),
}

# This is a bit of a hack, but we just create ONE split and then use
# tune_sklearn's TuneSearchCV, which allows different search algorithms.
split = PredefinedSplit([0] * len(X))

print("Starting search.")
tuner = TuneSearchCV(
    Optimizer(),
    param_distributions=param_dists,
    n_trials=20,
    max_iters=1,
    search_optimization="hyperopt",
    cv=split,
    n_jobs=4,
    use_gpu=False,
    random_state=seed,
)
```

As I understand it, scoring is done via my estimator's scoring function, which is a function I have built that returns a loss for which lower is better. At first it seemed the search was not optimizing, so I changed the function to return the negated loss instead, and the search returns the exact same results (except that the scores are negative). In other words, it seems as if hyperopt is not optimizing at all.
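For concreteness, one standard way to hand a lower-is-better loss to a scikit-learn-style search is sklearn.metrics.make_scorer with greater_is_better=False, which negates the loss so the search can maximize it. A minimal sketch (my_loss_fn is a placeholder, and it assumes the estimator exposes predict):

```python
import numpy as np
from sklearn.metrics import make_scorer

def my_loss_fn(y_true, y_pred):
    # placeholder loss where lower is better (e.g. mean squared error)
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

# greater_is_better=False makes scikit-learn negate the loss internally,
# so the search maximizes -loss, i.e. minimizes the loss.
loss_scorer = make_scorer(my_loss_fn, greater_is_better=False)

# Hypothetical usage: pass it to TuneSearchCV via the `scoring` argument.
# tuner = TuneSearchCV(Optimizer(), param_distributions=param_dists,
#                      scoring=loss_scorer, ...)
```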

Would you know what the problem is? Could it have to do with the fact that I only use a single split?

Thanks!

richardliaw commented 3 years ago

Hmm, the first few optimization steps of hyperopt are going to be random (to get a good idea of the loss landscape). After 20 or 30 steps, you should see optimization -- so maybe set n_trials to 100?
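i.e., keep everything else the same and just raise the trial budget, e.g.:

```python
tuner = TuneSearchCV(
    Optimizer(),
    param_distributions=param_dists,
    n_trials=100,  # give hyperopt enough trials to move past its random warm-up phase
    max_iters=1,
    search_optimization="hyperopt",
    cv=split,
    n_jobs=4,
    use_gpu=False,
    random_state=seed,
)
```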

Filco306 commented 3 years ago

Ah, I see, that might be it then! I will try and get back to you. Thank you!