sicara / easy-few-shot-learning

Ready-to-use code and tutorial notebooks to boost your way into few-shot learning for image classification.
MIT License

Why do I get a lower accuracy with EasySet? #146

Open TeddyPorfiris opened 5 months ago

TeddyPorfiris commented 5 months ago

When I import the Omniglot dataset and define the train and test set as follows, I get an accuracy of 98%.

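from torchvision import transforms
from torchvision.datasets import Omniglot

image_size = 28  # as in the my_first_few_shot_classifier notebook

# training set of Omniglot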
train_set = Omniglot(
    root="./data",
    background=True,
    transform=transforms.Compose(
        [
            transforms.Grayscale(num_output_channels=3),
            transforms.RandomResizedCrop(image_size),
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
        ]
    ),
    download=True,
)

# test set
test_set = Omniglot(
    root="./data",
    background=False,
    transform=transforms.Compose(
        [
            # Omniglot images have 1 channel, but our model will expect 3-channel images
            transforms.Grayscale(num_output_channels=3),
            transforms.Resize([int(image_size * 1.15), int(image_size * 1.15)]),
            transforms.CenterCrop(image_size),
            transforms.ToTensor(),
        ]
    ),
    download=True,
)

But when I download the Omniglot images from an external website and load them with EasySet (JSON files attached) so that they work with easyfsl, I get an accuracy of 79%.

from easyfsl.datasets import EasySet

train_json = "train.json"
test_json = "test.json"

train_set = EasySet(train_json)
test_set = EasySet(test_json)

Why is this? The rest of the code is the same for both situations (I am following the my_first_few_shot_classifier tutorial). Thanks a lot!

Attachments: test.json, train.json
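One way to check whether the two setups see the same data is to summarize the spec files before training. The sketch below assumes the attached files follow the usual EasySet layout, a JSON object with `class_names` and `class_roots` lists where each root is a folder of images. The helper name `summarize_specs` and the list of extensions are hypothetical and only for illustration.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_specs(specs_path, extensions=(".png", ".jpg", ".jpeg")):
    """Return (number of classes, images per class) for an EasySet-style spec file.

    Assumes the spec is a JSON object with "class_names" and "class_roots",
    two lists of equal length where each root is a directory of images.
    """
    specs = json.loads(Path(specs_path).read_text())
    names, roots = specs["class_names"], specs["class_roots"]
    if len(names) != len(roots):
        raise ValueError("class_names and class_roots must have the same length")

    counts = Counter()
    for name, root in zip(names, roots):
        # Count only files whose extension looks like an image.
        counts[name] += sum(
            1
            for path in Path(root).iterdir()
            if path.is_file() and path.suffix.lower() in extensions
        )
    return len(names), dict(counts)
```

For reference, the torchvision Omniglot background split has 964 characters and the evaluation split has 659, each with 20 images. If `summarize_specs("train.json")` and `summarize_specs("test.json")` report something very different, or if some classes appear in both files, the two experiments are not comparable.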

ebennequin commented 4 months ago

Without knowing which dataset you are using, there could be a number of reasons why performance drops. If you cannot trust your version of the Omniglot dataset, I strongly suggest you use the verified implementation from torchvision, which we use in the notebook.