ain-soph / trojanzoo

TrojanZoo provides a universal PyTorch platform for conducting security research (especially backdoor attacks/defenses) on image classification in deep learning.
https://ain-soph.github.io/trojanzoo
GNU General Public License v3.0

Code for ImageNet has bugs #157

Closed TDteach closed 2 years ago

TDteach commented 2 years ago

Lines 171 and 172 of trojanvision/datasets/imagefolder.py,

idx_to_class = {i: name for name, i in dataset.class_to_idx.items()}
return [idx_to_class[i] for i in range(len(idx_to_class.keys()))]

have a bug.

Specifically, a torchvision.datasets.ImageNet instance returns a class_to_idx dict that contains only 997 different classes. This is because the ImageNet dataset contains duplicated class names (e.g. class 134 and class 517 are both named crane). As a result, idx_to_class[i] raises a KeyError, because len(idx_to_class.keys()) = 997 and 134 is not in idx_to_class.
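
The collision described above can be reproduced with a toy example. This is only a sketch under stated assumptions: the four-entry `classes` list and the helper `get_class_names` are made up for illustration and are not torchvision's or trojanzoo's code. Only the duplicated name `crane` mirrors the report.

```python
# Toy stand-in for a dataset's class list: one tuple of names per class index.
# The entries are illustrative; only the duplicated "crane" mirrors the report.
classes = [("tench",), ("crane",), ("goldfish",), ("crane",)]

# A name -> index dict cannot hold the same name twice:
# the later "crane" (index 3) overwrites the earlier one (index 1).
class_to_idx = {name: idx for idx, names in enumerate(classes) for name in names}

# Invert the mapping, as the two reported lines do.
idx_to_class = {i: name for name, i in class_to_idx.items()}


def get_class_names(idx_to_class):
    """Hypothetical helper with the same lookup pattern as the reported lines."""
    return [idx_to_class[i] for i in range(len(idx_to_class))]


# Index 1 lost its only name, so the lookup fails.
try:
    get_class_names(idx_to_class)
    error = None
except KeyError as exc:
    error = exc
```

Here `idx_to_class` ends up with three entries (keys 0, 2 and 3), so iterating over `range(3)` hits the missing key 1.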

I suggest replacing these two lines with

        if isinstance(dataset, datasets.ImageNet):
            idx_to_class = {dataset.wnid_to_idx[wnid]: str(clss) for wnid, clss in zip(dataset.wnids, dataset.classes)}
        else:
            idx_to_class = {i: name for name, i in dataset.class_to_idx.items()}
        return [idx_to_class[i] for i in range(len(idx_to_class.keys()))]

ain-soph commented 2 years ago

Strange that it didn't happen in older versions. Maybe a recent torchvision update caused the issue.

I think it might be a better idea to override this method in the ImageNet class rather than add a special case to the parent class ImageFolder.
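
A rough sketch of that override approach, under assumptions: the class names `FolderBase` and `ImageNetLike`, the method name `get_class_names`, and the fake dataset object are all hypothetical and not the code actually committed to trojanzoo. The attributes `wnids`, `wnid_to_idx`, `classes` and `class_to_idx` are the ones used in the suggestion above.

```python
from types import SimpleNamespace


class FolderBase:
    """Hypothetical stand-in for the parent class; keeps the generic lookup."""

    @staticmethod
    def get_class_names(dataset):
        idx_to_class = {i: name for name, i in dataset.class_to_idx.items()}
        return [idx_to_class[i] for i in range(len(idx_to_class))]


class ImageNetLike(FolderBase):
    """Hypothetical subclass; overrides the lookup instead of special-casing the parent."""

    @staticmethod
    def get_class_names(dataset):
        # Key by WordNet ID, which is unique even when class names repeat.
        idx_to_class = {
            dataset.wnid_to_idx[wnid]: str(clss)
            for wnid, clss in zip(dataset.wnids, dataset.classes)
        }
        return [idx_to_class[i] for i in range(len(idx_to_class))]


# Fake dataset with made-up WordNet IDs and a duplicated class name.
fake = SimpleNamespace(
    wnids=["n01", "n02", "n03"],
    wnid_to_idx={"n01": 0, "n02": 1, "n03": 2},
    classes=[("crane",), ("goldfish",), ("crane",)],
    class_to_idx={"crane": 2, "goldfish": 1},
)

names = ImageNetLike.get_class_names(fake)
```

With this layout the parent class stays generic, and only the subclass needs to know about WordNet IDs.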

ain-soph commented 2 years ago

Thanks for reporting this issue. I just pushed a new commit and it works on my side. I'd appreciate it if you could test it as well.

If it's okay, I'll publish a new minor version update.

TDteach commented 2 years ago

Wonderful, this commit fixes the bug. Please publish a new minor version update.