facebook / Ax

Adaptive Experimentation Platform
https://ax.dev
MIT License

When should we end the Bandit Optimization? #2146

Closed cyrilmyself closed 7 months ago

cyrilmyself commented 8 months ago

Hello, if we run a MAB experiment, when should we stop the experiment?

mgarrard commented 8 months ago

Hi @cyrilmyself -- thanks for reaching out! Could you provide a bit more context and detail about your usecase, goals, parameters, and objective/constraints?

cyrilmyself commented 8 months ago

I use BoTorch to run Bandit Optimization as in https://ax.dev/tutorials/factorial.html. What I want to know is when I should stop the optimization; if I do not stop it, the experiment will run forever. So how do I set a stopping condition for the bandit optimization?

mgarrard commented 7 months ago

Hi @cyrilmyself, folks assess when to stop the optimization in a few different ways, but the underlying thought process is the same: run trials, typically in batches, and then check whether the results have converged or plateaued. Eventually there are diminishing returns from running additional trials, and that is usually a good place to stop the experiment. Folks are also typically constrained in the number of trials they can feasibly run, so that is another consideration. Leveraging visualizations is a great, intuitive way to decide when to complete the experiment.
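To make the "diminishing returns" idea concrete, here is a minimal sketch of one way to encode that check in plain Python; the helper name and the 1% threshold are illustrative assumptions, not part of the Ax API:

```python
# Minimal sketch of a "diminishing returns" stopping check. The helper name
# and the 1% relative threshold are illustrative assumptions, not Ax APIs.

def has_plateaued(best_value_per_batch, window=3, rel_threshold=0.01):
    """Return True if the best observed objective improved by less than
    `rel_threshold` (relative) over the last `window` batches."""
    if len(best_value_per_batch) <= window:
        return False  # not enough batches yet to judge convergence
    old_best = max(best_value_per_batch[:-window])
    new_best = max(best_value_per_batch)
    improvement = (new_best - old_best) / max(abs(old_best), 1e-12)
    return improvement < rel_threshold


# Example: the last three batches barely improve on the earlier best,
# so this would be a reasonable point to stop running additional trials.
print(has_plateaued([0.52, 0.61, 0.66, 0.662, 0.663, 0.664]))  # True
```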

In the case of the Bandit Optimization tutorial, the rollout process visualization shows that as more trials are run, only 4 arms are still being considered, down from over 20 arms in the first trial. We choose to run 4 trials in that code block. As a learning exercise, it could be interesting to modify the code block to run more or fewer trials and see how the number of trials affects how many arms remain under consideration. In practice, a good approach is to balance the number of trials you are able to run against what the results look like after x trials when deciding whether to continue the optimization.
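As a rough sketch of that kind of check (the 1% allocation cutoff and the loop structure are illustrative choices, not a prescribed recipe), you could count how many arms still receive a meaningful share of traffic after each batch:

```python
# Sketch of a "surviving arms" check in the spirit of the tutorial's rollout
# plot. `experiment` is the Experiment built in the factorial tutorial, whose
# trials are BatchTrials; the 1% allocation cutoff is an arbitrary choice.

MIN_WEIGHT = 0.01  # arms below this share of traffic are effectively dropped

for trial_index, trial in sorted(experiment.trials.items()):
    weights = trial.normalized_arm_weights()  # {Arm: fraction of traffic}
    surviving = [arm.name for arm, w in weights.items() if w >= MIN_WEIGHT]
    print(f"Trial {trial_index}: {len(surviving)} arms still in play")

# One possible stopping rule: end the experiment once this count stops
# shrinking from one batch to the next, or once you hit your trial budget.
```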

In other setups, such as Hyperparameter Optimization for PyTorch, you can plot the optimization trace to identify when model improvement plateaus with additional iterations. We also offer more advanced global stopping strategies that you can use to more "smartly" stop your experiment early if additional trials are unlikely to be beneficial.
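For example, a sketch along these lines would let the Service API decide when to stop; the exact import path and constructor arguments may differ between Ax versions, so please check the global stopping strategy docs for your release:

```python
from ax.global_stopping.strategies.improvement import ImprovementGlobalStoppingStrategy
from ax.service.ax_client import AxClient

# Stop the experiment once recent trials stop producing meaningful
# improvement; the specific numbers below are illustrative.
stopping_strategy = ImprovementGlobalStoppingStrategy(
    min_trials=10,         # never stop before this many completed trials
    window_size=5,         # look at the last 5 completed trials
    improvement_bar=0.01,  # stop if relative improvement is below 1%
)

ax_client = AxClient(global_stopping_strategy=stopping_strategy)
# ... then create_experiment and loop over get_next_trial / complete_trial as
# usual; the strategy signals when additional trials are unlikely to help.
```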

Is this helpful?

mgarrard commented 7 months ago

Closing the issue; please feel free to reach out again if something remains unclear. Have a great weekend @cyrilmyself :)