Open zlq147 opened 1 year ago
Hi @zlq147, thanks for your question! Here is a more detailed explanation of early termination. Please let me know whether it is useful or if you have any other questions.
Hi @zlq147, I wanted to follow up here! Was this information useful? Do you have any other questions about this?
I would like to add that while the current documentation is filled out and reasonably complete, I find it very hard to understand. I don't believe there are enough examples to understand what is going on. Here's precisely what I find unclear:
[...] min_iter to be 3, is my only option for multiple brackets to set eta=2 and stop at epoch 3 or 9?
[...] min_iter, brackets, iterations (steps, epochs or something else) and eta, but I guess that there's no way to make it much simpler, as there are just a lot of variables and names here. I do think more examples would help. Also, if the examples reiterated that the brackets correspond to the logging interval (steps, epochs), it would be clearer, because that makes it more concrete in the user's head for their particular use case. I am currently traveling and am not set up for a PR, but I could do this starting next week if there's interest and someone could review. Thank you.
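For what it's worth, the bracket arithmetic can be sketched in a few lines of Python. This is only my reading of the formulas in the early_terminate docs (min_iter multiplied by successive powers of eta, or max_iter divided by eta s times), not wandb's actual implementation. The helper names, the n_brackets argument, and rounding down with `//` are assumptions made for illustration.

```python
def brackets_from_min_iter(min_iter, eta=3, n_brackets=4):
    """Brackets when min_iter is given: min_iter * eta**k for k = 0, 1, 2, ...

    n_brackets is a hypothetical cutoff, used only to keep the list finite.
    """
    return [min_iter * eta**k for k in range(n_brackets)]


def brackets_from_max_iter(max_iter, eta=3, s=2):
    """Brackets when max_iter and s are given: max_iter / eta**k for k = 1..s.

    Rounding down with // is an assumption for values that do not divide evenly.
    """
    return [max_iter // eta**k for k in range(1, s + 1)]


print(brackets_from_min_iter(3, eta=3))        # [3, 9, 27, 81]
print(brackets_from_min_iter(3, eta=2))        # [3, 6, 12, 24]
print(brackets_from_max_iter(27, eta=3, s=2))  # [9, 3]
```

As far as I can tell, an "iteration" here is one logged value of the metric, so if the metric is logged once per epoch, a bracket at 3 means "compare this run with the others after epoch 3".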
Hi @rbracco, sorry for the long delay here! Just wanted to let you know that I submitted your feedback internally to improve our docs and better explain how the early terminate module works. Thanks a lot for sharing the detailed explanation!
I'd also love to see some concrete examples without having to dive into the paper!
Hi @JackCai1206, thanks for sharing the feedback! I'll share this with our team!
Same here. The docs are not clear IMHO.
Hi @luisbergua, just following up. Is there any progress on this issue? I'm facing the same issue as stated above.
Hi @luisbergua, any updates?
Hi @SuroshAhmadZobair @ziimiin14, apologies for the delay. I'll bump the priority of this with our Docs Team
I am trying to use a wandb sweep to tune the hyperparameters of a model, and I am also trying to use the hyperband early-termination method to speed it up.
However, I don't understand how this mechanism works from reading the docs https://docs.wandb.ai/guides/sweeps/define-sweep-configuration#early_terminate and the paper https://arxiv.org/abs/1603.06560.
In the paper, the authors propose the concept of a "resource". In my opinion, in the wandb setting, the "resource" should be the number of training epochs. However, in the "early_terminate" configuration, I can only see the parameters "s", "eta", "min_iter" and "max_iter", and from the explanation in the docs I do not understand what they really mean.
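To make the parameters concrete, here is a minimal sketch of a sweep configuration written as a Python dict. The top-level keys (method, metric, parameters, early_terminate) and the hyperband fields are the ones named in the sweep configuration docs. The hyperparameter names lr and epochs, their ranges, and the metric name valid_acc are assumptions chosen for illustration.

```python
# Sketch of a sweep configuration with hyperband early termination.
# "lr", "epochs" and the value ranges are illustrative assumptions.
sweep_config = {
    "method": "random",
    "metric": {"name": "valid_acc", "goal": "maximize"},
    "parameters": {
        "lr": {"min": 0.0001, "max": 0.1},
        "epochs": {"value": 27},
    },
    "early_terminate": {
        "type": "hyperband",
        "min_iter": 3,  # first bracket, counted in logged iterations
        "eta": 3,       # multiplier between successive brackets
        # Alternatively, give "max_iter" and "s" instead of "min_iter".
    },
}

# With wandb installed and logged in, this dict would be registered with:
#   import wandb
#   sweep_id = wandb.sweep(sweep_config, project="my-project")  # project name is hypothetical
```

The metric name in the configuration has to match the key that the training script logs, since that is the value the sweep compares across runs.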
In the GitHub examples, it is hard to tell whether early termination takes effect, so I hope there can be a simple piece of code that shows how early termination works. I also wonder whether the logged metric should be "valid_acc".
I would appreciate it if anyone could help me understand what the early-termination mechanism in wandb sweeps actually does, especially the meaning of the parameters and how to change the training code.
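On the training-code side, a minimal sketch of what I would expect to need: the script logs the metric being optimized ("valid_acc" here) once per epoch, so that each logged value counts as one iteration for the brackets. The loop below is a toy stand-in so that it runs without wandb. The train function, its log argument, and the fake accuracy update are all hypothetical. In a real sweep you would call wandb.init(), read hyperparameters from wandb.config, and pass wandb.log as log.

```python
import random


def train(config, log):
    """Toy training loop that reports the validation metric once per epoch.

    `log` is any callable taking a dict; in a real sweep it would be wandb.log.
    """
    rng = random.Random(config.get("seed", 0))
    valid_acc = 0.0
    for epoch in range(1, config["epochs"] + 1):
        # Stand-in for a real training epoch followed by validation.
        valid_acc = min(1.0, valid_acc + config["lr"] * rng.uniform(0.5, 1.0))
        # One log call per epoch: the 3rd call is iteration 3, the 9th is iteration 9.
        log({"epoch": epoch, "valid_acc": valid_acc})
    return valid_acc


# Local dry run: collect the logged dicts in a list instead of sending them to wandb.
history = []
final_acc = train({"epochs": 9, "lr": 0.1, "seed": 0}, history.append)
print(len(history), "metric values logged")

# In a real sweep (assumes wandb is installed and a sweep_id already exists):
#   import wandb
#   def run():
#       wandb.init()
#       train(dict(wandb.config), wandb.log)
#   wandb.agent(sweep_id, function=run, count=20)
```

No explicit stopping logic appears in the script itself. As I understand it, the sweep decides which runs to stop based on the logged values, and the agent ends those runs.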