Open Beliefzp opened 11 months ago
I want to know whether the label input is important to the model, so I delete the label input in my ablation study. This is my way:
_x1 = _x1 / _x1.norm(dim=-1, keepdim=True)
l_fea1_like = torch.ones_like(l_fea1)  # this is what I add
logits_per_image1 = logit_scale1 * _x1 @ l_fea1_like.t().float()
out1 = logits_per_image1.view(imshape[0][0], imshape[0][2], imshape[0][3], -1).permute(0, 3, 1, 2)
cam1 = out1.clone().detach()
cls1 = self.pooling(out1, (1, 1)).view(-1, l_fea1.shape[0])
As shown above, I added one line of code, "l_fea1_like = torch.ones_like(l_fea1)". To modify the code as little as possible, I replaced the original l_fea1 with an all-ones matrix of the same size, so that the label features are effectively unused.
But when I run the ablation study, the loss does not decrease, which seems to indicate that the model is not learning anything.
So I want to know why, and I would also like to know how the authors performed this ablation.
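For reference, here is a minimal standalone reproduction of the symptom (the shapes and the logit scale are made up, not the repository's actual values). If the label features are replaced by all ones, every class column of the logits is identical, so the softmax is uniform and the cross-entropy loss is pinned at log(num_classes) no matter what the encoder does:

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: batch of 8 image features, 4 classes, feature dim 512.
img_fea = torch.randn(8, 512)
img_fea = img_fea / img_fea.norm(dim=-1, keepdim=True)
l_fea = torch.ones(4, 512)  # all-ones stand-in for the label features

logits = 100.0 * img_fea @ l_fea.t()
# Every class column is identical, since each column of l_fea.t() is the same.
print(torch.allclose(logits[:, 0], logits[:, 1]))  # True

# Cross-entropy is therefore constant at log(4) ~= 1.386, regardless of the input.
loss = F.cross_entropy(logits, torch.randint(0, 4, (8,)))
print(loss.item())
```

This is consistent with the loss plateauing: the gradient signal through the similarity head vanishes once all classes score identically.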
Thank you for your interest in our work! When conducting the ablation study, I did not use the similarity between label features and image features to obtain the 4×H×W feature map. Instead, I first transformed the features output by the encoder into 4 channels using a 1×1 convolution, and then applied Global Average Pooling (GAP) to obtain the category predictions.
I added a convolution layer to the model:
self.conv_head = nn.Conv2d(in_channels=final_dimension, out_channels=self.num_classes, kernel_size=1, bias=True)
In the forward function, it is implemented as:
x = self.conv_head(x)
cam = x.detach().clone()
out = self.pooling(x, (1, 1)).view(-1, self.num_classes)
Additionally, during our actual training, we set the loss weight for stage 1 to 0.0, so it is reasonable that modifying the code for stage 1 did not have any effect.
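Putting the two snippets above together, the conv-head ablation can be sketched as a small module like the following (the class name, the default dimensions, and the use of adaptive_avg_pool2d for the GAP are my placeholders, not the repository's actual code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvHeadAblation(nn.Module):
    """Replaces the image-label similarity map with a plain 1x1 conv head.

    final_dimension and num_classes are placeholders for the encoder's
    output channels and the number of categories.
    """
    def __init__(self, final_dimension=512, num_classes=4):
        super().__init__()
        self.num_classes = num_classes
        self.conv_head = nn.Conv2d(in_channels=final_dimension,
                                   out_channels=num_classes,
                                   kernel_size=1, bias=True)

    def forward(self, x):
        # x: encoder feature map of shape (B, final_dimension, H, W)
        x = self.conv_head(x)          # (B, num_classes, H, W) -- the CAM source
        cam = x.detach().clone()
        # Global Average Pooling over the spatial dims yields the class logits.
        out = F.adaptive_avg_pool2d(x, (1, 1)).view(-1, self.num_classes)
        return out, cam

# Usage with a dummy feature map:
head = ConvHeadAblation()
out, cam = head(torch.randn(2, 512, 7, 7))
print(out.shape, cam.shape)  # (2, 4) and (2, 4, 7, 7)
```

This keeps the rest of the pipeline (CAM extraction, pooling to logits) unchanged while removing any dependence on the label features.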
Thank you very much! I have solved it!