liznerski / fcdd

Repository for the Explainable Deep One-Class Classification paper
MIT License

Editing required custom dataset structure #30

Closed: GreatScherzo closed this issue 2 years ago

GreatScherzo commented 2 years ago

Hi! Is it possible to use a different custom dataset structure by editing the program?

As far as I understand, the custom dataset structure currently required is the following:

└── custom
    ├── test
    │   └── [class name]
    │         ├── anomalous
    │         │   └── img1.bmp, img2.bmp, ....
    │         └── normal
    │              └── img_NOK1.bmp, img_NOK2.bmp, ....
    └── train
        └── [class name]
              ├── anomalous
              │   └── train_img1.bmp, train_img2.bmp, ....
              └── normal
                   └── train_img_NOK1.bmp, train_img_NOK2.bmp, ....

I would like to modify the program so that it accepts a dataset structure like the one below. The train and test datasets would not be in the same folder; each would live in its own folder with this structure. The class level is removed entirely.

└── [arbitrary dataset name]
    ├── OK
    │   └── img1.bmp, img2.bmp, ....
    │       
    └── NOK
        └── img_NOK1.bmp, img_NOK2.bmp, ....

If this can be done, which parts of the code should I modify?

Thank you very much for your time!

liznerski commented 2 years ago

Hey. Sure, you just need to make the ADImageFolderDataset and the ImageFolderDataset find the images.

1) Set the train and test path here.
2) Then, in the init of the ImageFolderDataset here, you need to set self.samples. The current implementation uses the default PyTorch code of the ImageFolder class to find all sample paths.
3) Then, set the anomaly labels here. The current implementation reads the folder names, which wouldn't work for your folder structure.
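As a rough illustration of steps 2 and 3, the following is a minimal sketch of how sample paths and anomaly labels could be collected from an OK/NOK layout. It is not the fcdd code: `collect_samples`, `NORMAL_LABEL`, `ANOMALOUS_LABEL`, and `IMAGE_EXTENSIONS` are hypothetical names, and the convention that normal images get label 0 and anomalous images get label 1 is an assumption that should be checked against the repository.

```python
from pathlib import Path
from typing import List, Tuple

# Assumptions (not taken from the fcdd code): normal = 0, anomalous = 1.
NORMAL_LABEL = 0
ANOMALOUS_LABEL = 1
# Hypothetical set of accepted image file extensions.
IMAGE_EXTENSIONS = {".bmp", ".png", ".jpg", ".jpeg"}


def collect_samples(
    root: str, normal_dir: str = "OK", anomalous_dir: str = "NOK"
) -> List[Tuple[str, int]]:
    """Return (image path, anomaly label) pairs for a root/OK, root/NOK layout."""
    samples: List[Tuple[str, int]] = []
    for folder, label in ((normal_dir, NORMAL_LABEL), (anomalous_dir, ANOMALOUS_LABEL)):
        folder_path = Path(root) / folder
        if not folder_path.is_dir():
            # For example, a training set that has no anomalous images.
            continue
        for path in sorted(folder_path.iterdir()):
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((str(path), label))
    return samples
```

The resulting list has the same shape as the `samples` attribute of torchvision's `ImageFolder` (a list of path and integer pairs), so it could in principle be assigned to `self.samples`, with the second element of each pair used as the anomaly label.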

GreatScherzo commented 2 years ago

Thank you for your fast reply! I'm currently trying to modify it based on what you said.

I'll post the code here for everyone's reference!

GreatScherzo commented 2 years ago

I was able to modify the custom dataset structure.

I overrode the find_classes method of the ImageFolder class so that it returns the classes and class_to_idx I want. With that, self.samples contains the expected sample paths. Here is what I did, for reference:

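# To enable ImageFolderDataset to read the new dataset structure, the inherited
# ImageFolder class is overridden.
from typing import Dict, List, Tuple

from torchvision.datasets import ImageFolder

# Assumption: CLASSES is not defined in the original post. It presumably lists
# the subfolder names under the dataset root, for example the OK and NOK folders.
CLASSES = ["OK", "NOK"]


class ImageFolderModified(ImageFolder):
    # Override find_classes so the class list is fixed instead of being
    # discovered from the subfolder names of `directory`.
    def find_classes(self, directory: str) -> Tuple[List[str], Dict[str, int]]:
        # parent_dirname = os.path.dirname(directory)
        # classes = [os.path.basename(directory)]
        classes = CLASSES

        class_to_idx = {cls_name: i for i, cls_name in enumerate(classes)}
        return classes, class_to_idx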

I also did steps 1 and 3 as described by @liznerski.
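For step 3, one possible way to derive anomaly labels from the class indices is sketched below. This is only an illustration: `to_anomaly_labels` is a hypothetical helper, the folder name `NOK` comes from the structure described earlier in this thread, and the 0 = normal, 1 = anomalous convention is an assumption rather than something confirmed from the fcdd code.

```python
from typing import Dict, List, Tuple


def to_anomaly_labels(
    samples: List[Tuple[str, int]],
    class_to_idx: Dict[str, int],
    anomalous_class: str = "NOK",
) -> List[int]:
    """Map ImageFolder-style (path, class index) pairs to anomaly labels.

    Assumed convention: 1 for the anomalous class, 0 for everything else.
    """
    anomalous_idx = class_to_idx[anomalous_class]
    return [1 if class_idx == anomalous_idx else 0 for _, class_idx in samples]


# With alphabetically sorted folder names, NOK gets index 0 and OK gets
# index 1, so the raw class index cannot be used directly as the label.
example_samples = [("OK/img1.bmp", 1), ("NOK/img_NOK1.bmp", 0)]
example_labels = to_anomaly_labels(example_samples, {"NOK": 0, "OK": 1})
print(example_labels)  # prints [0, 1]
```

Looking the index up by name keeps the labels correct regardless of the order in which the classes were enumerated.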

Thank you very much for your help @liznerski !