Closed by nicolebussola 2 years ago
@nicolebussola
Hi, Nicole, thanks for your interest.
As we mentioned in Section 4 of the paper, we set the baseline input (X_b) to zero for simplicity. In our implementation, we directly set f(X_b) to zero, since it is a constant for any fixed choice of X_b. In our observations, the difference is minor. Hope this helps.
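To make the point about f(X_b) concrete, here is a minimal sketch, not the repository's code. The names `normalize_map`, `channel_weights` and `toy_model` are hypothetical, the toy model stands in for a real classifier score, and the activation maps are assumed to be already upsampled to the input size. It only illustrates that dropping f(X_b) shifts every channel weight by the same constant.

```python
import numpy as np

def normalize_map(a):
    """Min-max scale one activation map to [0, 1]; a constant map becomes all zeros."""
    span = a.max() - a.min()
    return (a - a.min()) / span if span > 0 else np.zeros_like(a)

def channel_weights(f, x, activation_maps, x_b=None):
    """Increase of confidence per channel: f(x * H_k) - f(x_b).

    When x_b is None, f(x_b) is simply taken to be zero (the shortcut
    described in the reply above).
    """
    baseline_score = 0.0 if x_b is None else f(x_b)
    masked_scores = np.array([f(x * normalize_map(a)) for a in activation_maps])
    return masked_scores - baseline_score

def toy_model(x):
    """Hypothetical stand-in for a class score: a sigmoid of the pixel sum."""
    return float(1.0 / (1.0 + np.exp(-(x.sum() - 2.0))))

rng = np.random.default_rng(0)
x = rng.random((4, 4))        # toy "image"
maps = rng.random((3, 4, 4))  # three toy activation maps, same size as x

w_shortcut = channel_weights(toy_model, x, maps)                        # f(X_b) := 0
w_explicit = channel_weights(toy_model, x, maps, x_b=np.zeros_like(x))  # f(X_b) evaluated
offset = w_shortcut - w_explicit  # the same value, f(X_b), for every channel
```

In this toy setting the two weight vectors differ only by the constant `toy_model(np.zeros_like(x))`, so the ranking of channels is unchanged. How much that shift matters for a real network is an empirical question.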
Dear Haofan, thank you for your quick reply.
We find your method very reliable and robust, but we have some questions that we have been working on for some time, and we think you could help us. Let me briefly explain our problem: we have a classifier trained on a binary problem that classifies the presence or absence of a certain feature. As an example, suppose we have images with a bike and a person and images with bikes only, and we are interested in finding the images with people in them.
We would really appreciate your opinion,
Best
@nicolebussola
If I understand correctly, you have a binary classifier (w/wo person) in your example.
For Q1, the heatmap for "No Person" may well be messy.
For Q2, an underlying reason may be that your model is not discriminative enough. In the ideal case, the result of Score-CAM can work as a coarse detector. I'm not sure why you get two similar maps. I suggest you visualize the activation maps and their linear weights for each class (w/wo).
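One way to act on the suggestion above is to put the per-class weights side by side before plotting anything. The following is a rough sketch under stated assumptions: `class_saliency` and `compare_classes` are hypothetical helpers, and the weights and activation maps are made-up numbers rather than output from a real model. The saliency map is the usual CAM-style ReLU of the weighted sum of activation maps.

```python
import numpy as np

def class_saliency(weights, activation_maps):
    """ReLU of the weighted sum of activation maps: (K,) and (K, H, W) -> (H, W)."""
    combined = np.tensordot(weights, activation_maps, axes=1)
    return np.maximum(combined, 0.0)

def compare_classes(w_a, w_b, activation_maps, top=3):
    """Summarize how similar two classes' channel weights are."""
    return {
        # Pearson correlation between the two weight vectors
        "weight_correlation": float(np.corrcoef(w_a, w_b)[0, 1]),
        # indices of the highest-weighted channels, largest first
        "top_channels_a": np.argsort(w_a)[::-1][:top].tolist(),
        "top_channels_b": np.argsort(w_b)[::-1][:top].tolist(),
        "map_a": class_saliency(w_a, activation_maps),
        "map_b": class_saliency(w_b, activation_maps),
    }

rng = np.random.default_rng(0)
maps = rng.random((5, 4, 4))                            # made-up activation maps
w_person = np.array([0.90, 0.10, 0.05, 0.70, 0.20])     # made-up weights, "person"
w_no_person = np.array([0.85, 0.15, 0.10, 0.65, 0.25])  # made-up weights, "no person"

report = compare_classes(w_person, w_no_person, maps)
```

If the weight correlation is close to 1 and the top channels coincide, as in this made-up example, both classes are leaning on the same activation maps, which would be consistent with getting two similar heatmaps. The arrays in `map_a` and `map_b`, and the individual activation maps for the top channels, can then be passed to any image plotting routine for visual inspection.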
Not active. Closed.
I am @nicolebussola's colleague. Thank you very much. Your insights have been very useful. We are looking into that. If we get any interesting results, we will give you an update.
Hi, I was looking at the computation of coefficients for the activation maps:
In the paper (and in the comment), you refer to the Increase of Confidence, so the score should be computed as the difference between the score of the raw input and the score of the masked input. However, looking at this implementation, we understand that the score is just the one predicted on the masked input. Am I missing something?
Thank you,
Nicole