drishyamlabs closed this issue 4 years ago
Thanks for this question!
In the paper, we describe our method as d(logit) / d(CAV), but in the code, for convenience, we compute d(loss) / d(CAV). Because it is the loss, it moves in the opposite direction to the logit: the lower the loss, the higher the logit and hence p(x). That's why we take the negative dot product; it just flips the sign back to the right direction. Hope this helps!
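To make the sign flip concrete, here is a minimal sketch on a toy model. It is not the code in tcav.py: the tiny network, the random activations, and the random stand-in for the CAV are all hypothetical, and it assumes a single logit with loss = -log(sigmoid(logit)). Under that assumption, d(loss)/d(activation) is a negative multiple of d(logit)/d(activation), so counting negative dot products with the loss gradient gives the same ratio as counting positive dot products with the logit gradient.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup (not the TCAV library): 50 examples with
# 4-dimensional activations, a small tanh network that produces one logit,
# and a random unit vector standing in for a learned CAV.
acts = rng.normal(size=(50, 4))
W1 = rng.normal(size=(3, 4))
w2 = rng.normal(size=3)
cav = rng.normal(size=4)
cav /= np.linalg.norm(cav)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# Forward pass: logit = w2 . tanh(W1 @ act)
hidden = np.tanh(acts @ W1.T)   # shape (50, 3)
logits = hidden @ w2            # shape (50,)

# d(logit)/d(act) for each example, computed analytically.
grad_logit = ((1.0 - hidden ** 2) * w2) @ W1   # shape (50, 4)

# With loss = -log(sigmoid(logit)):
# d(loss)/d(act) = -(1 - sigmoid(logit)) * d(logit)/d(act)
grad_loss = -(1.0 - sigmoid(logits))[:, None] * grad_logit

# Directional derivatives along the CAV.
dir_deriv_logit = grad_logit @ cav
dir_deriv_loss = grad_loss @ cav

# Ratio of examples where the concept pushes the logit up ...
score_from_logit = np.mean(dir_deriv_logit > 0)
# ... equals the ratio of examples where it pushes the loss down.
score_from_loss = np.mean(dir_deriv_loss < 0)

print(score_from_logit, score_from_loss)
```

The two printed ratios agree because (1 - sigmoid(logit)) is always positive in this toy setting, so it changes the magnitude of the gradient but never its sign.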
Been
Hi,
Thanks a lot for sharing this valuable code. I have a few basic questions:
According to the research paper, the concept vector is orthogonal to the decision boundary. Can you please point us to where in the code that happens? In the implementation (https://github.com/tensorflow/tcav/blob/master/tcav/tcav.py), line 86, the TCAV score is defined as "TCAV score (i.e., ratio of pictures that returns negative dot product wrt loss)." Can you please tell us why a negative dot product is treated as positive influence? It would really help resolve my confusion. Looking forward to your response.