mvandenhi opened this issue 2 weeks ago (status: Open)
(As a side note, there are many hardcoded paths, which makes reproducing the results harder.)
Thank you for your interest in our work, and for pointing out the discrepancy in the loss function; I should have clarified this. Algorithm 1 presents a simplified version that omits the threshold to keep the method easy to follow. In practice, the threshold helps stabilize training: when the generated mask is near-ideal (loss2 is much larger than loss1), setting the second loss term to zero avoids artifacts.
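For concreteness, here is a minimal sketch of the thresholded objective described above. The function name, the `threshold` margin, and the exact gating condition are my assumptions for illustration, not code from the repository:

```python
import torch

def comet_style_loss(loss1, loss2, l1_loss, weight_a, threshold=0.0):
    """Hypothetical sketch: zero out the second term when the mask is
    near-ideal (loss2 exceeds loss1 by more than `threshold`), so it no
    longer pushes the selector and training stays stable."""
    # Detach the comparison so the gating itself is not differentiated.
    if (loss2 - loss1).detach() > threshold:
        loss2 = torch.zeros_like(loss2)
    return loss1 - weight_a * loss2 + l1_loss
```

With the gate active, the returned loss reduces to `loss1 + l1_loss`; otherwise it matches the `loss1 - weight_a * loss2 + l1_loss` form used in the repository.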
For this loss:

```python
loss = (loss1 - self.weight_a * loss2 + l1_loss)
optimizers[0].zero_grad()
optimizers[1].zero_grad()
loss.backward()
optimizers[0].step()
optimizers[1].step()
```
Do you compute gradients for both the predictor and the selector with a single `loss.backward()`? I ran into errors when I tried that in my experiments. Does it work correctly for you?
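For reference, a minimal runnable sketch (not the repository's code) of this single-backward, two-optimizer pattern. A single `backward()` populates `.grad` for both modules as long as the selector's output is not detached before it enters the predictor:

```python
import torch

# Toy stand-ins for the selector and predictor (shapes are arbitrary).
selector = torch.nn.Linear(4, 4)
predictor = torch.nn.Linear(4, 1)

opt_sel = torch.optim.SGD(selector.parameters(), lr=0.1)
opt_pred = torch.optim.SGD(predictor.parameters(), lr=0.1)

x = torch.randn(8, 4)
target = torch.randn(8, 1)

opt_sel.zero_grad()
opt_pred.zero_grad()
loss = torch.nn.functional.mse_loss(predictor(selector(x)), target)
loss.backward()  # gradients flow through both modules
opt_sel.step()
opt_pred.step()

assert selector.weight.grad is not None
assert predictor.weight.grad is not None
```

A common source of errors here is calling `backward()` twice on a graph that shares intermediate tensors, or detaching the mask before the predictor consumes it; neither happens in the pattern above.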
For the ViT experiments: in models/COMET_net.py you can find this commented-out block:

```python
'''
self.predictor = timm.create_model('vit_small_patch16_224', pretrained=pretrained)
self.predictor.head = torch.nn.Linear(self.predictor.head.in_features, num_classes)

self.completement_pred = timm.create_model('vit_small_patch16_224', pretrained=pretrained)
self.completement_pred.head = torch.nn.Linear(self.completement_pred.head.in_features, num_classes)
'''
```

Note: I used ViT only for the predictor and the feature detector, while keeping LRASPP as the feature selector. I'm guessing you used ViT for mask generation, which might explain the pixelation in the mask.
Apologies for any confusion. Please feel free to email me if you have further questions.
Hi @Zood123, I'm trying to replicate the results from your work. I've noticed that (as far as I can tell) the loss function implemented in your repository differs from the one stated in the paper: in train.py, line 186 (which is the line that computes your method's loss, right?), the second summand differs from the version in the paper.
Additionally, would you agree that

```python
loss = (loss1 - self.weight_a * loss2 + l1_loss)
optimizers[0].zero_grad()
optimizers[1].zero_grad()
loss.backward()
optimizers[0].step()
optimizers[1].step()
```

would be a valid loss computation?