Open jsrdcht opened 1 week ago
Thanks for your reminder. I will work on it as soon as possible.
In fact, maskdistill is based on beit-2, but the pretraining/fine-tuning hyper-parameters are somewhat different. Can you describe the gap you see, or any details or problems you ran into when replicating it?
A year later, the source code for unimim still has not been released. A year ago, I spent a long time trying to replicate and analyze this paper, but was unable to do so. Are there any plans to make the source code available?
To be honest, I am a bit disappointed.