keyu-tian / SparK

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"
https://arxiv.org/abs/2301.03580
MIT License

SparK for semantic segmentation #71

Closed: simonebonato closed this issue 4 months ago

simonebonato commented 7 months ago

Hi and thanks for the amazing work :)

I would like to pre-train some models, among them a Mask R-CNN with a ResNet50-FPN backbone and a U-Net (plus some variations). Can SparK also be used for the U-Net encoder, even though the task is semantic segmentation rather than instance segmentation? Does it still make sense to try?

Thanks :)

keyu-tian commented 7 months ago

@simonebonato thank you.

I'm not an expert in semantic segmentation, but from what I've seen in self-supervised learning papers, a common approach is to add a non-pre-trained UperNet head on top of a pre-trained backbone and then fine-tune the whole model for semantic segmentation. Perhaps you could try to fine-tune our pre-trained ConvNeXt with https://github.com/facebookresearch/ConvNeXt/tree/main/semantic_segmentation.
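The recipe described here (pre-trained backbone + freshly initialized segmentation head, fine-tuned end to end) can be sketched roughly as follows. The backbone and head below are toy placeholders, not ConvNeXt or UperNet; in practice one would load the released SparK checkpoint into the backbone and use the configs in the linked ConvNeXt semantic_segmentation code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy stand-in for a pre-trained backbone (real case: SparK-pretrained ConvNeXt).
backbone = nn.Sequential(
    nn.Conv2d(3, 64, 3, stride=4, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
)

num_classes = 21
# The segmentation head is NOT pre-trained: it starts from random init.
head = nn.Conv2d(128, num_classes, 1)

x = torch.randn(2, 3, 224, 224)
feat = backbone(x)                       # stride-8 feature map: (2, 128, 28, 28)
logits = head(feat)                      # per-pixel class logits at stride 8
logits = F.interpolate(logits, size=x.shape[-2:],
                       mode='bilinear', align_corners=False)

# Fine-tune backbone and head together with a per-pixel loss.
target = torch.randint(0, num_classes, (2, 224, 224))
loss = F.cross_entropy(logits, target)
loss.backward()
```

The key point of the recipe is that both parts receive gradients: the backbone is only initialized from pre-training, not frozen.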

For your idea, I believe it makes sense to pre-train a U-Net encoder. I also think it might be better if the whole U-Net (including the decoder) were pre-trained together by SparK, but that would require more implementation effort.
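To make the encoder-only idea concrete: SparK's pretraining operates on hierarchical feature maps, one per downsampling stage, so a U-Net encoder would need to expose its stage outputs in that form. Below is a minimal sketch of such an encoder; the method names (`get_downsample_ratio`, `get_feature_map_channels`, `forward(..., hierarchical=...)`) follow the custom-CNN tutorial in the SparK repo, but verify the exact interface against the repo's `pretrain/models` code before relying on it:

```python
import torch
import torch.nn as nn

class UNetEncoder(nn.Module):
    """Toy U-Net-style encoder that can return one feature map per
    stage, the hierarchical output format SparK's pretraining expects."""
    def __init__(self, in_ch=3, widths=(64, 128, 256, 512)):
        super().__init__()
        self.widths = list(widths)
        self.stages = nn.ModuleList()
        ch = in_ch
        for w in widths:
            # Each stage halves the spatial resolution.
            self.stages.append(nn.Sequential(
                nn.Conv2d(ch, w, 3, stride=2, padding=1),
                nn.BatchNorm2d(w),
                nn.ReLU(inplace=True),
            ))
            ch = w

    def get_downsample_ratio(self) -> int:
        # Total stride of the deepest stage (2^4 = 16 here).
        return 2 ** len(self.stages)

    def get_feature_map_channels(self):
        return self.widths

    def forward(self, x, hierarchical=False):
        feats = []
        for stage in self.stages:
            x = stage(x)
            feats.append(x)
        # hierarchical=True -> list of all stage outputs (for pretraining);
        # otherwise just the last feature map.
        return feats if hierarchical else x

enc = UNetEncoder()
feats = enc(torch.randn(2, 3, 224, 224), hierarchical=True)
```

After pretraining, the encoder weights would be loaded into the full U-Net and the decoder trained from scratch on the segmentation task.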

simonebonato commented 7 months ago

Ok, so I can either take one of the pretrained ConvNeXt models you provided, add a UperNet head, and fine-tune the whole thing, or try to use SparK directly on a whole U-Net (encoder + decoder)?

I can try to do both and I will let you know in case I get something good :)

keyu-tian commented 7 months ago

Yeah they're two possible ways. Good luck with your experiments!