Closed justcho5 closed 4 years ago
Hi, since the concepts are patches extracted from the original images (not the whole images themselves), this should not cause a problem. For well-known datasets like ImageNet, using random dataset images works fine. For a different dataset, it should also be fine as long as the extracted patches are diverse enough.
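If you want to check separability on your own data rather than take this on faith, one option is to fit a linear classifier on concept activations versus random activations and look at held-out accuracy. Below is a minimal sketch, not the code from this repository: `fit_cav` is a hypothetical helper, the classifier is a plain NumPy logistic regression swapped in for illustration, and the activations are synthetic placeholders standing in for real layer activations.

```python
import numpy as np


def fit_cav(concept_acts, random_acts, lr=0.1, steps=500, seed=0):
    """Fit a linear boundary between two activation sets.

    Hypothetical helper for illustration only. Returns the unit-norm
    normal of the boundary (the CAV direction) and held-out accuracy.
    """
    rng = np.random.default_rng(seed)
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])

    # Shuffle, then hold out one third of the examples for evaluation.
    order = rng.permutation(len(X))
    n_train = (2 * len(X)) // 3
    train, test = order[:n_train], order[n_train:]

    # Logistic regression trained with full-batch gradient descent.
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        logits = np.clip(X[train] @ w + b, -30.0, 30.0)
        probs = 1.0 / (1.0 + np.exp(-logits))
        err = probs - y[train]
        w -= lr * (X[train].T @ err) / n_train
        b -= lr * err.mean()

    preds = (X[test] @ w + b) > 0
    accuracy = float((preds == (y[test] == 1)).mean())
    cav = w / (np.linalg.norm(w) + 1e-12)
    return cav, accuracy


# Synthetic stand-ins for layer activations (assumption: 16-dim, 200 each).
rng = np.random.default_rng(42)
random_acts = rng.normal(size=(200, 16))

# Case 1: concept patches differ from random images along one direction.
shift = np.zeros(16)
shift[0] = 4.0
concept_distinct = rng.normal(size=(200, 16)) + shift
cav_sep, acc_separable = fit_cav(concept_distinct, random_acts)

# Case 2: concept activations drawn from the same distribution as the
# random set, i.e. the worst case described in the question.
concept_same = rng.normal(size=(200, 16))
cav_overlap, acc_overlap = fit_cav(concept_same, random_acts)

print("separable accuracy:", acc_separable)
print("overlapping accuracy:", acc_overlap)
```

Held-out accuracy close to 0.5 would suggest the random set is too similar to the concept patches for a meaningful CAV, while accuracy well above chance suggests the random dataset images are fine as the negative class.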
Hi,
I had a question about the CAV computation. From the original paper, it looks like a CAV is computed by finding the linear decision boundary between the concept images and completely random images. In this implementation of ACE, the README says to use randomly chosen images from the dataset. Won't that make it difficult to find a linear boundary separating the concepts from the random images, since the concepts are derived from the dataset and will therefore also be represented in the random dataset images?
In the end, I'm unsure whether I should collect random images (from Google or ImageNet) for the CAV computation or use random images from the dataset. Do you know whether the random dataset images are separated well from the concept images?
Thanks