jiaosiyu1999 / MAFT

What is the role of Stage 1? #4

Closed: NanAlbert closed this issue 5 months ago

NanAlbert commented 5 months ago

This is interesting work! It appears that Stage 1 is designed to train a model that generates masks for Stage 2. Beyond supplying masks for Stage 2's MAFT training, does Stage 1 serve any other purpose? If not, why train a model in Stage 1 instead of directly using a pre-trained model such as FreeSeg or Mask2Former? Thank you!

jiaosiyu1999 commented 5 months ago

Thanks for your interest. Yes, your understanding is correct: Stage 1 is designed to train a model capable of generating masks for Stage 2.

We retrained FreeSeg because its official implementation differs from the standard OV-Seg setting: FreeSeg is trained with unified semantic, instance, and panoptic labels, whereas OV-Seg models are trained only on COCO-Stuff. You can also use other pre-trained models, e.g., SAM or MaskFormer, to generate the masks.
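
To illustrate the last point, here is a minimal Python sketch, assuming Stage 2 only needs a list of class-agnostic masks per image. All names in it (`MaskProposer`, `threshold_proposer`, `collect_stage2_inputs`) are hypothetical and not part of the MAFT codebase, and the threshold-based proposer is only a toy stand-in for a real model such as a retrained FreeSeg, SAM, or MaskFormer.

```python
from typing import Callable, List, Sequence, Tuple

# A binary mask is a 2D grid of 0/1 values; an image is a 2D grid of intensities.
Mask = List[List[int]]
Image = Sequence[Sequence[float]]

# Hypothetical interface: anything that maps an image to class-agnostic masks.
MaskProposer = Callable[[Image], List[Mask]]


def threshold_proposer(image: Image) -> List[Mask]:
    """Toy stand-in for a Stage 1 model (assumption, not a real segmenter).

    Splits the image into a "bright" and a "dark" region. A real proposer
    would run a segmentation network here instead.
    """
    bright = [[1 if px >= 0.5 else 0 for px in row] for row in image]
    dark = [[1 - v for v in row] for row in bright]
    # Drop masks that cover no pixels at all.
    return [m for m in (bright, dark) if any(any(row) for row in m)]


def collect_stage2_inputs(
    images: Sequence[Image], proposer: MaskProposer
) -> List[Tuple[Image, List[Mask]]]:
    """Pair each image with the masks from whichever proposer is plugged in."""
    return [(img, proposer(img)) for img in images]


if __name__ == "__main__":
    image = [[0.9, 0.1], [0.8, 0.2]]
    for img, masks in collect_stage2_inputs([image], threshold_proposer):
        print(f"{len(masks)} mask proposals for a {len(img)}x{len(img[0])} image")
```

Because the downstream step depends only on the `MaskProposer` signature in this sketch, swapping the mask source means passing a different callable, with no change to the code that consumes the masks.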