THU-MIG / RepViT

RepViT: Revisiting Mobile CNN From ViT Perspective [CVPR 2024] and RepViT-SAM: Towards Real-Time Segmenting Anything
https://arxiv.org/abs/2307.09283
Apache License 2.0
681 stars 55 forks

About distillation #52

Open muxin-wei opened 3 months ago

muxin-wei commented 3 months ago

Thanks for your great work! I've trained a SAM with TinyViT as the image encoder and am now trying to replace it with RepViT-M2.3, whose image_embedding is required to be the same as TinyViT's. To get good performance, is it effective to simply perform KD from TinyViT-SAM while copying and freezing the mask decoder, following MobileSAM? Could you provide an example of how to perform KD with my trained SAM as the teacher model?
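
For reference, the setup described above (embedding-level KD from a frozen teacher encoder, with the mask decoder left out of the training loop) might look roughly like the sketch below. It is only an illustration and not code from the RepViT or MobileSAM repositories: `TinyEncoder` is a hypothetical stand-in for both the TinyViT teacher and the RepViT-M2.3 student, `distill_step` is a made-up helper name, and the MSE loss on image embeddings is an assumption about how the distillation would be set up.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for an image encoder (teacher or student)."""

    def __init__(self, width: int, embed_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, width, kernel_size=4, stride=4),
            nn.GELU(),
            nn.Conv2d(width, embed_dim, kernel_size=4, stride=4),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)


def distill_step(teacher, student, images, optimizer) -> float:
    """One KD step: regress student embeddings onto the teacher's (MSE)."""
    with torch.no_grad():  # the teacher only provides targets
        target = teacher(images)
    pred = student(images)
    loss = F.mse_loss(pred, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


torch.manual_seed(0)

# Teacher: in practice, the image encoder of the already trained TinyViT-SAM.
teacher = TinyEncoder(width=32).eval()
for p in teacher.parameters():
    p.requires_grad_(False)

# Student: in practice, RepViT-M2.3 producing embeddings of the same shape.
student = TinyEncoder(width=16)
optimizer = torch.optim.AdamW(student.parameters(), lr=1e-3)

# Random tensors stand in for a batch of training images.
images = torch.randn(2, 3, 64, 64)
losses = [distill_step(teacher, student, images, optimizer) for _ in range(20)]
```

Only the student encoder is updated in this sketch. Under the assumption that the student's embeddings end up close to the teacher's, the mask decoder and prompt encoder would be copied from the trained TinyViT-SAM and kept frozen, then attached to the distilled student at inference time.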