haofanwang / Score-CAM

Official implementation of Score-CAM in PyTorch
MIT License
399 stars, 66 forks

I have some questions about your Score-CAM paper, looking forward to your answer #15

Closed lijing-coder closed 3 years ago

lijing-coder commented 3 years ago

In section 4.2 of the experiment part of the paper, there is a sentence saying "In this experiment, rather than do point-wise multiplication with the original generated saliency map, we slightly modify by limiting the number of positive pixels in the saliency map." Could you explain how you did this experiment? Compared with Grad-CAM++, which parts did you modify?

haofanwang commented 3 years ago

Hi, @lijing-coder, thanks for your interest.

We don't use the original saliency map directly, because we found that our map is much cleaner than those of previous works (which means that the number of pixels being manipulated is different). Thus, in our work, we limit the number of pixels.

For example, suppose there are 100 positive pixels in the Score-CAM map but 1000 positive pixels in the Grad-CAM map. It would be unfair to do point-wise multiplication directly, right? So we restrict the operation so that only the top-N pixels are influenced.
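
As an illustration of the top-N restriction described above, here is a minimal sketch, not the authors' experimental code. The helper name `keep_top_n`, the toy 8x8 maps, and the choice `n = 10` are all assumptions made for the example.

```python
import torch


def keep_top_n(saliency: torch.Tensor, n: int) -> torch.Tensor:
    """Zero every pixel except the n largest values (hypothetical helper)."""
    flat = saliency.flatten()
    n = min(n, flat.numel())
    # Indices of the n largest saliency values.
    top_idx = torch.topk(flat, n).indices
    kept = torch.zeros_like(flat)
    kept[top_idx] = flat[top_idx]
    return kept.view_as(saliency)


# Two toy maps with very different numbers of positive pixels.
torch.manual_seed(0)
sparse_map = torch.zeros(1, 8, 8)
sparse_map[0, :2, :] = torch.rand(2, 8) + 0.1  # 16 positive pixels
dense_map = torch.rand(1, 8, 8) + 0.1          # 64 positive pixels

n = 10
sparse_top = keep_top_n(sparse_map, n)
dense_top = keep_top_n(dense_map, n)

# After the restriction, both maps influence the same number of pixels.
print((sparse_top > 0).sum().item(), (dense_top > 0).sum().item())
```

With the same N applied to both maps, the subsequent point-wise multiplication touches an equal number of pixels regardless of how "clean" each method's map is.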

lijing-coder commented 3 years ago

Thanks for your answer. Your paper says "50% of pixels of the image are muted in our experiment", so I wonder whether N in top-N is 50% of the pixels or some other fixed value. If N is a fixed value, what value did you use in your paper?

haofanwang commented 3 years ago

I did the experiment with VGG, where the input is of size 224x224 (the channel dimension is ignored here; our saliency map is of size 1x224x224). So we mute 0.5x224x224 pixels in the saliency map (after sorting, the positions with lower values are muted) before we do the multiplication. Hope this helps.
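
The sort-and-mute step described above could be sketched as follows. This is an illustrative reading of the description, not the original experimental code: `mute_lowest_fraction` is a hypothetical helper, and the random tensor stands in for a real saliency map.

```python
import torch


def mute_lowest_fraction(saliency: torch.Tensor, fraction: float = 0.5) -> torch.Tensor:
    """Set the lowest-valued `fraction` of pixels to zero (hypothetical helper)."""
    flat = saliency.flatten()
    num_muted = int(fraction * flat.numel())
    # Ascending sort: the first num_muted indices are the lowest-valued positions.
    order = torch.argsort(flat)
    muted = flat.clone()
    muted[order[:num_muted]] = 0
    return muted.view_as(saliency)


# Stand-in for a 1x224x224 saliency map with strictly positive values.
torch.manual_seed(0)
saliency = torch.rand(1, 224, 224) + 0.01

masked = mute_lowest_fraction(saliency, 0.5)
num_zero = (masked == 0).sum().item()
print(num_zero)  # 0.5 * 224 * 224 = 25088 muted positions
```

Sorting by index mutes exactly the requested number of positions even when values tie, whereas thresholding at a cutoff value can mute slightly more or fewer pixels.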

lijing-coder commented 3 years ago

Yes, thanks very much, it is most helpful.

lijing-coder commented 3 years ago

Dear author, could I refer to your experimental code? I have done many experiments but failed to reproduce the results in your paper. When I select the top 50% of pixels of the CAM heatmap before multiplication, the result I get is much larger than yours. My email is 836341536@qq.com. Thanks again.

haofanwang commented 3 years ago

Sorry, I don't have the experimental code on hand now. Below is how I did the experiment; I hope this helps.

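import torch
import torch.nn.functional as F

# model, input_, image_label and vgg_scorecam are assumed to be defined elsewhere
raw_output = F.softmax(model(input_), dim=1)  # class probabilities for the original image
mask_scorecam = vgg_scorecam(input_, class_idx=image_label)
# mute the positions whose saliency is below the median (the lower 50%)
mask_scorecam[mask_scorecam < torch.median(mask_scorecam)] = 0
score_output = F.softmax(model(input_ * mask_scorecam), dim=1)  # probabilities for the masked image
drop_score = float(max(0, raw_output[0][image_label] - score_output[0][image_label]) / raw_output[0][image_label])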
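
The per-image drop computed in the last line above can be isolated as a plain function, which makes the clamping at zero easier to check. The sketch below is an assumption-laden illustration: the function names are hypothetical, the scores are made-up numbers, and it assumes the per-image values are averaged over the dataset and reported as a percentage.

```python
def relative_drop(raw_score: float, masked_score: float) -> float:
    """Per-image drop: max(0, raw - masked) / raw (hypothetical helper)."""
    return max(0.0, raw_score - masked_score) / raw_score


def average_drop(raw_scores, masked_scores) -> float:
    """Mean of the per-image drops, as a percentage (assumed aggregation)."""
    drops = [relative_drop(r, m) for r, m in zip(raw_scores, masked_scores)]
    return 100.0 * sum(drops) / len(drops)


# Made-up target-class probabilities before and after masking.
raw_scores = [0.9, 0.8, 0.5]
masked_scores = [0.45, 0.9, 0.5]

# The second image's score increased, so its drop is clamped to 0.
print([relative_drop(r, m) for r, m in zip(raw_scores, masked_scores)])
print(average_drop(raw_scores, masked_scores))
```

Because increases are clamped to zero, an image whose confidence rises after masking contributes nothing to the average rather than offsetting drops on other images.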

Please let me know if you have further questions.

haofanwang commented 3 years ago

Not active. Closed.

Feel free to re-open it if your problem has not been solved.

lijing-coder commented 3 years ago

I would like to know which versions of Grad-CAM and Grad-CAM++ you used in your paper's experiments. Different CAM libraries take different approaches to feature extraction and image pre-processing, so I don't know which version to use. I have tested many versions but cannot reproduce your experimental results.