Open tasakim opened 5 months ago
Hi! Could you please explain what `del_pos_init` and `get_pos_init` do? There are no such operations in the original Torch-Pruning when pruning ViT or Swin Transformer. Why is the positional embedding removed here, and what effect does this have on pruning?

Hi @tasakim, thanks for the issue. The positional-embedding scheme of SAM's encoder is too complex for the dependency-detection algorithm we use in pruning. As a result, we remove the positional embedding before pruning and add a zero-initialized positional embedding back after pruning. After post-distillation, the performance of the compressed encoder is not significantly affected.
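For illustration, here is a minimal sketch of what the two operations might look like on a toy encoder. The `TinyEncoder` module, its dimensions, and the helper signatures are all hypothetical stand-ins, not the repository's actual implementation: the idea is only that the `pos_embed` parameter is deleted before pruning (so the dependency graph stays simple) and a zero-initialized one matching the pruned width is attached afterward.

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Toy stand-in for SAM's ViT encoder (hypothetical, for illustration)."""
    def __init__(self, embed_dim=8, num_tokens=4):
        super().__init__()
        self.pos_embed = nn.Parameter(torch.randn(1, num_tokens, embed_dim))
        self.proj = nn.Linear(embed_dim, embed_dim)

def del_pos_init(model):
    """Drop the positional embedding before pruning (assumed behavior)."""
    del model.pos_embed        # nn.Module removes it from _parameters
    model.pos_embed = None     # keep the attribute so later code can set it

def get_pos_init(model, num_tokens, embed_dim):
    """Re-attach a zero-initialized positional embedding after pruning."""
    model.pos_embed = nn.Parameter(torch.zeros(1, num_tokens, embed_dim))

m = TinyEncoder()
del_pos_init(m)
# ... structured pruning would shrink the embedding dim here, e.g. 8 -> 6 ...
pruned_dim = 6
m.proj = nn.Linear(pruned_dim, pruned_dim)
get_pos_init(m, num_tokens=4, embed_dim=pruned_dim)

print(tuple(m.pos_embed.shape))          # shape matches the pruned width
print(bool(torch.all(m.pos_embed == 0))) # zero-initialized, trained later via distillation
```

The zeros act as a neutral starting point: the compressed encoder initially ignores position, and post-distillation trains the new embedding from scratch.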