Open ZihaoZheng98 opened 4 months ago
Tsinghua has already published this
Actually, MLLA is trained with self-distillation, so its results cannot be fairly compared with those of models trained without it.
https://github.com/LeapLabTHU/MLLA/blob/master/main.py#L264-L270