Hello author, I ran pre-training in three configurations: with medsam_checkpoint as the teacher model, with sam_vit_b_checkpoint as the teacher model, and with no teacher model. I found that the pre-training results differ greatly depending on the teacher model. With medsam_checkpoint, the loss is 18.9 at the beginning and 4.2 after 300 epochs of training, which is worse than the run without a teacher model is at the very beginning. My data is medical images, and the sam_vit_b_checkpoint and medsam_checkpoint models have exactly the same structure, so this result confuses me. I hope you can give me some tips. Thank you for your reply.