Add a hill-climbing algorithm as an Optuna sampler

toshihikoyanase commented 3 months ago

Motivation

Hill climbing is a local search algorithm commonly used for optimization problems. The sampler based on the hill-climbing algorithm aims to provide a simple yet effective way to explore the hyperparameter space and find good solutions.

Description

For simplicity, this hill-climbing sampler will only support the discrete case, where hyperparameters are represented as integers or categorical values.

The key components of the hill-climbing sampler will include:

Initialization: This sampler will start from a random point in the hyperparameter space using RandomSampler of Optuna
Lists neighboring points: The sampler will maintain two lists: one for neighboring points to be evaluated and another for those that have been evaluated.
Move: This sampler will move in the direction of steepest ascent based on the current point and the neighboring points
Restart: This sampler will restart the search when no neighboring points improve objective values

Alternatives (optional)

No response

Additional context (optional)

No response

csking101 commented 6 days ago

Hi @toshihikoyanase , could I take up this issue?

toshihikoyanase commented 6 days ago

Hi @csking101 . Thank you for your interest. Please go ahead!

csking101 commented 6 days ago

Hey @toshihikoyanase , so I wanted to share my approach, and get feedback for the same:

I will implement a Sampler from the BaseSampler
I will store the following variables in that - evaluated_points(list), remaining_points(list), best_points(list)
My search space is a discretized version of space in the respective distribution, so I will take a point and its neighbours will be the all the points offset from the current_point in all the distributions (so for d dimensions, there will be 2*d neighbours). Is this correct for the relative sampling method? So pretty much we move one direction at a time, or should it be considering multiple axes' points at a time?
The rest of the algorithm remains the same, I will consider which neighbour gives the best value as per the objective, and make that my current_point, I will repeat this until there is no improvement
When this happens, I will start a new trial from another random point, with no relation to the previous trials (however, I will store the previous best points from the trials). Should I give some maximum limit for the number of trials?

Is this the right path?

toshihikoyanase commented 2 days ago

@csking101 Thank you for sharing the design of the sampler.

I will implement a Sampler from the BaseSampler

Yes, you can use BaseSampler. Alternatively, you may use SimpleBaseSampler, which simplifies the implementation. This tutorial provides the usage of SimpleBaseSampler.

I will store the following variables in that - evaluated_points(list), remaining_points(list), best_points(list)

Sounds good. evaluated_points might not be necessary since Study holds all evaluated points as Study.trials.

My search space is a discretized version of space in the respective distribution, so I will take a point and its neighbours will be the all the points offset from the current_point in all the distributions (so for d dimensions, there will be 2*d neighbours). Is this correct for the relative sampling method? So pretty much we move one direction at a time, or should it be considering multiple axes' points at a time?

Please move a single axis at a time because this issue targets straightforward hill climbing. As you mentioned, this approach can involve evaluating a large number of neighbours, it should remain manageable for simple problems.

The rest of the algorithm remains the same, I will consider which neighbour gives the best value as per the objective and make that my current_point, I will repeat this until there is no improvement

Sounds reasonable.

When this happens, I will start a new trial from another random point, with no relation to the previous trials (however, I will store the previous best points from the trials).

Sounds good! Please restart from a randomly chosen point.

Should I give some maximum limit for the number of trials?

There's no need to set a limit within the sampler, as users can specify the maximum number of trials via Study.optimize.

csking101 commented 1 day ago

Thank you for the reply, I will open a PR shortly!

optuna / optunahub-registry