czg1225 / SlimSAM

SlimSAM: 0.1% Data Makes Segment Anything Slim
Apache License 2.0

Question about del_pos_init and get_pos_init #10

Open tasakim opened 5 months ago

tasakim commented 5 months ago

Hi! Could you please tell me the meaning of del_pos_init and get_pos_init? There are no such operations in the original torch_pruning when pruning a ViT or Swin Transformer. Why is the positional embedding removed here, and what effect does this have on pruning?

czg1225 commented 5 months ago

Hi @tasakim, thanks for the issue. The positional embedding scheme of SAM's encoder is too complex for the dependency detection algorithm we use in pruning. We therefore remove the positional embedding before pruning and add a zero-initialized positional embedding after pruning. After post-distillation, the performance of the compressed encoder is not significantly affected.
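To illustrate, here is a minimal sketch of that workflow (not SlimSAM's actual implementation; `TinyEncoder` and the 8 → 6 channel prune are made up for the example): the positional embedding is detached before structural pruning so the dependency graph never sees it, and a zero-initialized embedding sized to the pruned width is attached afterwards.

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Toy stand-in for a ViT-style encoder with a learned pos embed."""
    def __init__(self, embed_dim=8, tokens=4):
        super().__init__()
        self.pos_embed = nn.Parameter(torch.randn(1, tokens, embed_dim))
        self.proj = nn.Linear(embed_dim, embed_dim)

    def forward(self, x):
        if self.pos_embed is not None:
            x = x + self.pos_embed
        return self.proj(x)

def del_pos_init(model):
    # Step 1: remove the positional embedding so the pruner's
    # dependency detection never has to trace through it.
    model.pos_embed = None

def get_pos_init(model, tokens, new_dim):
    # Step 2: after pruning shrinks embed_dim, attach a
    # zero-initialized positional embedding of the new width;
    # distillation then recovers its values.
    model.pos_embed = nn.Parameter(torch.zeros(1, tokens, new_dim))

model = TinyEncoder(embed_dim=8, tokens=4)
del_pos_init(model)
# Structural pruning would shrink the layers here; emulate an
# 8 -> 6 channel prune by swapping in a narrower projection.
model.proj = nn.Linear(6, 6)
get_pos_init(model, tokens=4, new_dim=6)

out = model(torch.randn(1, 4, 6))
print(out.shape)  # torch.Size([1, 4, 6])
```

Because the re-added embedding starts at zero, it initially contributes nothing to the forward pass, and the subsequent distillation stage learns useful positional values from the teacher.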