mhamilton723 / STEGO

Unsupervised Semantic Segmentation by Distilling Feature Correspondences

Regarding Loss Function using ground truth labels in linear_loss part #89

Open SonalKumar95 opened 7 months ago

SonalKumar95 commented 7 months ago

Hello there,

Please let me know if I'm wrong. In `train_segmentation.py`, the loss function includes two extra terms, i.e., `linear_loss` and `cluster_loss`. For unsupervised training, the `cluster_loss` seems okay, but the `linear_loss` uses the ground-truth labels. Is that the case, or am I getting it wrong? Please help me.

Thanks in advance.

mhamilton723 commented 7 months ago

Hey Sonal, that's okay because there's a detach and clone before the linear loss. It's basically just an entirely separate probe component whose gradients don't alter the training of the unsupervised components.
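For reference, a minimal sketch of that detach-and-clone pattern (shapes and module definitions here are illustrative stand-ins, not the actual STEGO code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative shapes/modules; not the actual STEGO definitions.
backbone_code = torch.randn(4, 32, 28, 28, requires_grad=True)  # unsupervised features
linear_model = nn.Conv2d(32, 27, kernel_size=1)                 # supervised linear probe
labels = torch.randint(0, 27, (4, 28, 28))                      # ground-truth masks

# detach + clone: the probe sees the features, but no gradient
# can flow back through them into the unsupervised components.
detached_code = torch.clone(backbone_code.detach())

linear_output = linear_model(detached_code)
linear_loss = F.cross_entropy(linear_output, labels)
linear_loss.backward()

print(backbone_code.grad)                    # None: the backbone is untouched by the probe
print(linear_model.weight.grad is not None)  # True: only the probe is trained on labels
```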


bio-mlhui commented 1 month ago

I am also confused. `detached_code = torch.clone(model_output[1].detach())` generates `detached_code`, which does not require grad. However, `linear_output = self.linear_model(detached_code)` generates `linear_output`, which does require grad. Since `linear_output` is used to compute `linear_loss`, which uses the ground-truth mask labels, the final loss will backpropagate its gradient into `linear_model`. Does this mean the final algorithm is not unsupervised?
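A quick sanity check of the gradient flow described above (the tensors below are hypothetical stand-ins for `model_output[1]` and the probe):

```python
import torch
import torch.nn as nn

code = torch.randn(1, 8, 4, 4, requires_grad=True)  # stand-in for model_output[1]
linear_model = nn.Conv2d(8, 3, kernel_size=1)       # stand-in linear probe

detached_code = torch.clone(code.detach())
print(detached_code.requires_grad)  # False: cut off from the backbone graph

linear_output = linear_model(detached_code)
print(linear_output.requires_grad)  # True: grad still flows to the probe's own weights

linear_output.sum().backward()
print(code.grad)                             # None: nothing reaches the unsupervised features
print(linear_model.weight.grad is not None)  # True: only the supervised probe is updated
```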