wangmingaaaaa closed 9 months ago
Hi @wangmingaaaaa
Unfortunately, no. I have tried it, but performance did not improve significantly with more than two teachers. A possible reason is that the teachers eventually converge to the same local minimum in the late stages of training, and adding more teachers speeds up this process.
A potential solution is to optimise the third teacher in a different way, and this paper might be relevant.
Cheers, Yuyuan
Thank you very much for your reply
Would it be better to integrate three teachers?