Closed zhanglei1172 closed 2 years ago
Hi @zhanglei1172 Thanks for the interest in our work and the proactive feedback.
I'll address the two points you raised below:

`< self.max_SH_iter` is used since promotions (as designed in DEHB) can occur during the later brackets while still within the first HB bracket or, as written in the comments, the first set of SH brackets. The exact point at which promotion ceases and evolution begins across all subpopulations depends on the setting of the min-max budget and `eta`. In DEHB, promotions happen until every subpopulation (including the highest fidelity) has its population filled with evaluated configurations coming from a lower rung (or randomly sampled). However, `promotion`s end after the first HB bracket or, as we call it in the paper, the Initialization bracket.

`self.iteration_counter` was chosen to represent the main outer iteration in an unambiguous manner. Even in the paper we call each SH bracket one iteration, where the count is incremented continually. Since in our methods we overload the term bracket as an `SH bracket`, an `HB bracket`, or a `DEHB bracket`, it made sense to name the main loop counter `iteration` to avoid any confusion.

I would be happy to hear from you if this answers your concerns. Alternatively, we could close the PR if this suffices. Cheers!
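As a rough illustration of the first point (a sketch only, not the DEHB code): assuming the usual Hyperband layout where the budgets form a geometric ladder from the minimum to the maximum budget and each successive SH bracket starts one rung higher, the first HB bracket can be enumerated as below. The function names and the `"sample_or_de"` / `"promote"` labels are made up for this example.

```python
def rung_budgets(min_budget, max_budget, eta):
    """Geometric budget ladder: min_budget, min_budget * eta, ... up to max_budget."""
    budgets = [min_budget]
    while budgets[-1] * eta <= max_budget:
        budgets.append(budgets[-1] * eta)
    return budgets


def first_hb_bracket_schedule(budgets):
    """List (sh_index, rung, budget, action) for every rung of the first HB bracket.

    Assumption for illustration: SH bracket k starts at budgets[k]; its first
    rung is sampled (or comes from DE), every later rung is filled by promotion.
    """
    schedule = []
    for sh_index in range(len(budgets)):
        for rung, budget in enumerate(budgets[sh_index:]):
            action = "promote" if rung > 0 else "sample_or_de"
            schedule.append((sh_index, rung, budget, action))
    return schedule


budgets = rung_budgets(min_budget=1, max_budget=27, eta=3)
max_sh_iter = len(budgets)  # number of SH brackets in one HB bracket
schedule = first_hb_bracket_schedule(budgets)

# SH brackets that still contain at least one promotion step
sh_with_promotion = sorted({s for s, _, _, action in schedule if action == "promote"})
```

With these example values, promotions still appear in SH brackets after the first one, which is why the check has to cover the whole first HB bracket rather than only the first SH bracket.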
Thanks a lot for your reply, but `< self.max_SH_iter` still confuses me.
I took a close look at the algorithm section in the DEHB paper. The algorithm distinguishes `iteration` and `bracket counter`, and line 11 of the algorithm explicitly states that `promotion` occurs when `bracket counter == 0`. I admit that `self.iteration_counter` in the repository represents the `SH bracket`. If `self.iteration_counter == self.max_SH_iter`, then a complete `hyperband` process has been completed. And all subpopulations have been evaluated after the first `SH bracket` (which has different rungs) has been completed. Can you explain a little more about the difference between the algorithm and the code? Thanks again.
> Can you explain a little more about the difference between the algorithm and the code? thanks again.
Firstly, I had to look it up again myself and ended up going through the paper, in particular Appendix C.1 at the end of Page 9. Read together with the pseudo-code, I think it makes sense and matches what the code does. I must admit that the exact nomenclature might not have been carried over to the code owing to implementation details. I hope you understand that.
I'll try my best to summarize briefly the parallels of the code and the pseudo-code (algo).
The `bracket_counter` in the algo counts the outer iterations or the main DEHB brackets. In the code, `self.iteration_counter` instead counts the SH bracket numbers. That relates to the termination_condition check here.
For `self.iteration_counter`, the increment happens at the same level as an SH bracket. This would be equivalent to putting `bracket_counter += 1` in the inner scope at L21.
However, for the algo, the `bracket_counter` represents a DEHB/HB bracket and `bracket_counter is 0` represents the Initialization bracket.
In L7, `i` represents the rungs of the SH bracket.
The way to read L11 would be that while in the Initialization bracket, for every SH bracket, we do promotions from the second rung onwards, since the first rungs are either randomly sampled or obtained through vanilla-DE search.
To translate the same into the actual code, `self.iteration_counter < self.max_SH_iter` is the accurate representation of being within the Initialization bracket.
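The correspondence described in the points above can be sketched in a few lines. This is only an illustration under the assumption that every DEHB/HB bracket contains exactly `max_sh_iter` SH brackets; the function names are hypothetical and none of this is taken from the repository.

```python
def pseudo_code_order(n_dehb_brackets, max_sh_iter):
    """Nested loops as in the pseudo-code: an outer bracket_counter and an inner SH bracket index."""
    order = []
    for bracket_counter in range(n_dehb_brackets):
        for sh_index in range(max_sh_iter):
            order.append((bracket_counter, sh_index))
    return order


def flat_counter_order(n_iterations, max_sh_iter):
    """One flat counter incremented per SH bracket, split back into (bracket_counter, sh_index)."""
    return [divmod(iteration_counter, max_sh_iter)
            for iteration_counter in range(n_iterations)]


max_sh_iter = 4  # example value, e.g. min budget 1, max budget 27, eta 3
nested = pseudo_code_order(n_dehb_brackets=3, max_sh_iter=max_sh_iter)
flat = flat_counter_order(n_iterations=3 * max_sh_iter, max_sh_iter=max_sh_iter)

# Both views enumerate the SH brackets in the same order.
same_order = nested == flat

# "bracket_counter == 0" in the algo and "iteration_counter < max_sh_iter"
# on the flat counter select exactly the same SH brackets.
conditions_agree = all(
    (iteration_counter < max_sh_iter) == (bracket_counter == 0)
    for iteration_counter, (bracket_counter, _) in enumerate(nested)
)
```

Under that assumption, the flat counter and the nested loops are two ways of indexing the same sequence of SH brackets, so the two conditions pick out the same Initialization bracket.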
I must mention again that the looping shown in the pseudo-code may not correspond exactly to the code. For the parallel design or for some optimization, our class design, functions, and data structures will have certain variations from the pseudo-code. Even so, the pseudo-code is always a valid first reference. I would recommend reading the paper and relating that to the code, as it is the same logic.
Hope that I understood and addressed your concerns!
In this code, `self.iteration_counter` represents the bracket counter, so direct promotion should only be done during the first bracket. I also think it would be better to rename `self.iteration_counter` to `self.bracket_counter`.