automl / DEHB

https://automl.github.io/DEHB/
Apache License 2.0

fix promotion bug #16

Closed zhanglei1172 closed 2 years ago

zhanglei1172 commented 2 years ago

In this code, self.iteration_counter represents the bracket counter, so direct promotion should only be done in the first bracket. I also think it would be better to rename self.iteration_counter to self.bracket_counter.

Neeratyoy commented 2 years ago

Hi @zhanglei1172, thanks for your interest in our work and the proactive feedback.

I'll address the 2 points raised by you below:

I would be happy to hear from you whether this answers your concerns. If it does, we could close the PR. Cheers!

zhanglei1172 commented 2 years ago

Thanks a lot for your reply, but the condition < self.max_SH_iter still confuses me.

I took a close look at the algorithm section in the DEHB paper. The algorithm distinguishes between an iteration counter and a bracket counter, and line 11 of the algorithm explicitly states that promotion occurs when bracket counter == 0. I accept that self.iteration_counter in the repository represents the SH bracket. If self.iteration_counter == self.max_SH_iter, then a complete Hyperband process has been completed. And all subpopulations have been evaluated after the first SH bracket (which has different rungs) has been completed. Can you explain a little more about the difference between the algorithm and the code? thanks again.

Neeratyoy commented 2 years ago

Can you explain a little more about the difference between the algorithm and the code? thanks again.

Firstly, I had to look it up again myself: I went back through the paper and read Appendix C.1 at the end of page 9. Read together with the pseudo-code, it makes sense and matches what the code does. I must admit that the exact nomenclature was not carried over to the code, owing to implementation details. I hope you understand that.

I'll try my best to summarize briefly the parallels of the code and the pseudo-code (algo).

The bracket_counter in the algo counts the outer iterations, i.e. the main DEHB brackets. In the code, self.iteration_counter instead counts the SH brackets. That relates to the termination_condition check here. For self.iteration_counter, the increment happens at the same level as an SH bracket. This would be equivalent to putting bracket_counter += 1 in the inner scope at L21 of the algo. In the algo, however, bracket_counter represents a DEHB/HB bracket, and bracket_counter == 0 represents the initialization bracket.
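To make the two counting schemes concrete, here is a small standalone sketch, not taken from the DEHB code base. It assumes that every DEHB bracket consists of exactly max_SH_iter SH brackets; the function trace_counters and its argument num_dehb_brackets are hypothetical names used only for illustration.

```python
def trace_counters(num_dehb_brackets: int, max_SH_iter: int):
    """Yield (bracket_counter, iteration_counter) once per SH bracket.

    Assumption (not from the repository): each DEHB bracket contains
    exactly max_SH_iter SH brackets.
    """
    iteration_counter = 0  # code-style counter: one tick per SH bracket
    for bracket_counter in range(num_dehb_brackets):  # algo-style counter
        for _ in range(max_SH_iter):  # SH brackets inside one DEHB bracket
            yield bracket_counter, iteration_counter
            iteration_counter += 1


trace = list(trace_counters(num_dehb_brackets=2, max_SH_iter=3))
for bracket_counter, iteration_counter in trace:
    print(bracket_counter, iteration_counter)
```

Under that assumption, bracket_counter equals iteration_counter // max_SH_iter for every SH bracket, so bracket_counter == 0 holds exactly while iteration_counter < max_SH_iter.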

In L7 of the algo, i indexes the rungs of the SH bracket. The way to read L11 is that, while in the initialization bracket, every SH bracket does promotions from the second rung onwards, since the first rungs are either randomly sampled or obtained through vanilla-DE search.

To translate the same into the actual code, self.iteration_counter < self.max_SH_iter is the accurate representation of being within the initialization bracket.
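The reading of L11 above can be written as a small predicate. This is only an illustrative sketch under the interpretation given in this comment, not the repository's implementation; the function names in_initialization_bracket and promotes_directly are hypothetical, and rung is assumed to be a zero-based index within an SH bracket.

```python
def in_initialization_bracket(iteration_counter: int, max_SH_iter: int) -> bool:
    """True while the SH-bracket counter is still inside the first DEHB bracket."""
    return iteration_counter < max_SH_iter


def promotes_directly(iteration_counter: int, max_SH_iter: int, rung: int) -> bool:
    """Direct promotion: initialization bracket only, second rung onwards.

    rung is assumed zero-based, so rung 0 is the first rung, which is
    randomly sampled or obtained through vanilla-DE search instead.
    """
    return in_initialization_bracket(iteration_counter, max_SH_iter) and rung > 0


# Example with max_SH_iter = 3: SH brackets 0..2 form the initialization bracket.
print(promotes_directly(iteration_counter=1, max_SH_iter=3, rung=1))
print(promotes_directly(iteration_counter=1, max_SH_iter=3, rung=0))
print(promotes_directly(iteration_counter=3, max_SH_iter=3, rung=1))
```

With these inputs, only the first call satisfies both conditions: the second is on the first rung, and the third is past the initialization bracket.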

I must mention again that the looping shown in the pseudo-code may not correspond exactly to the code. For the parallel design or for some optimizations, our class design, functions, and data structures will vary somewhat from the pseudo-code. Even so, the pseudo-code is always a good first reference. I would recommend reading the paper and relating it to the code, as the logic is the same.

Hope that I understood and addressed your concerns!