Closed: cobraheleah closed this issue 2 years ago
Yes, we fine-tune the teacher with the BEiT fine-tuning recipe. We sweep a sufficiently large range of hyper-parameters to get high performance, because the original papers (CLIP/DINO) do not report fine-tuning results.
Thanks, Zhiliang.
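
For readers wondering what such a sweep looks like in practice, below is a minimal sketch of a grid search over common fine-tuning hyper-parameters (learning rate, layer-wise lr decay, drop path). The script name, flag names, checkpoint path, and value ranges are illustrative assumptions, not the exact recipe used in the paper; it only prints the commands so you can adapt them to your own launcher.

```python
# Hypothetical hyper-parameter sweep for fine-tuning a teacher model (e.g. CLIP/DINO ViT).
# Everything here (grid values, script name, flags, checkpoint path) is an assumption
# for illustration, not the exact recipe from the BEiT v2 paper.
import itertools

learning_rates = [5e-5, 1e-4, 2e-4, 4e-4]   # assumed candidate peak learning rates
layer_decays   = [0.60, 0.65, 0.75]          # assumed layer-wise lr decay candidates
drop_paths     = [0.1, 0.2]                  # assumed stochastic-depth rates

for lr, layer_decay, drop_path in itertools.product(learning_rates, layer_decays, drop_paths):
    # Placeholder command: substitute the actual fine-tuning entry point and
    # checkpoint paths from your codebase before launching.
    cmd = [
        "python", "run_class_finetuning.py",      # hypothetical script name
        "--model", "vit_base_patch16_224",
        "--finetune", "teacher_checkpoint.pth",   # hypothetical teacher weights
        "--lr", str(lr),
        "--layer_decay", str(layer_decay),
        "--drop_path", str(drop_path),
        "--epochs", "100",
    ]
    print("would launch:", " ".join(cmd))  # dry run: print instead of executing
```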
Hey, thank you for publishing the BEiT v2 paper. I was wondering: in Table 6 of the paper, what does the "performance of teacher models" mean?