pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Adding dataset Tiny-Imagenet #6127

Open towzeur opened 2 years ago

towzeur commented 2 years ago

πŸš€ The feature

Hello,

I would like to contribute to torchvision by providing an implementation of the Tiny ImageNet dataset.

home: https://www.kaggle.com/c/tiny-imagenet
paper: http://vision.stanford.edu/teaching/cs231n/reports/2015/pdfs/yle_project.pdf
zip: http://cs231n.stanford.edu/tiny-imagenet-200.zip

This challenge is part of the Stanford class CS 231N. Label classes and bounding boxes are provided.

details:
- classes: 200
- image_size: 64x64x3
- bbox: x0, y0, x1, y1 for each image
- train split: 100,000 (500 per class)
- val split: 10,000 (50 per class)
- test split: 10,000 (50 per class)

Motivation, pitch

Note: the original test split doesn't have targets and bboxes. Thus, in this implementation, I used the val split when passing train=True.

Features:

Structure:

root
β”œβ”€β”€β”€tiny-imagenet-200.zip
└───tiny-imagenet-200
    β”œβ”€β”€β”€npy <-- generated
    β”‚   β”œβ”€β”€β”€test_bboxes.npy
    β”‚   β”œβ”€β”€β”€test_data.npy
    β”‚   β”œβ”€β”€β”€test_targets.npy
    β”‚   β”œβ”€β”€β”€train_bboxes.npy
    β”‚   β”œβ”€β”€β”€train_data.npy
    β”‚   └───train_targets.npy
    β”œβ”€β”€β”€test
    β”œβ”€β”€β”€train
    β”œβ”€β”€β”€val
    β”œβ”€β”€β”€words.txt
    └───wnids.txt

Here is the implementation: https://github.com/towzeur/vision/commit/a67feb569361f440fd48ed492183de8bd8f6b585

Alternatives

No response

Additional context

No response

cc @pmeier @YosuaMichael

datumbox commented 2 years ago

@towzeur Thanks for offering to help!

Let us get back to you on this. We are in the process of migrating to a new API, so it's unclear whether at this point we will add more datasets on the old one. I've recorded your proposal on the RFC at #3562 so that we won't forget it. BTW, note that putting the standard torchvision.datasets.ImageFolder on top of a folder that contains the extracted data should work for Tiny ImageNet. We've been using it internally to quickly check models, so that's a quick workaround until it's properly added.
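The workaround above can be sketched as follows. This is a minimal illustration assuming the zip has been extracted under a `root` directory with the layout shown earlier in the issue; `train_split_dir` is a hypothetical helper, not part of torchvision:

```python
import os

def train_split_dir(root):
    """The train split already has the one-subdirectory-per-class layout
    that ImageFolder expects, so it can be used directly."""
    return os.path.join(root, "tiny-imagenet-200", "train")

# Requires torchvision and the extracted archive on disk:
# from torchvision.datasets import ImageFolder
# dataset = ImageFolder(train_split_dir("./data"))
print(train_split_dir("./data"))
```

The val and test splits do not follow this layout, as discussed below, so this shortcut only covers training data.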

cc @NicolasHug

towzeur commented 2 years ago

Thanks for your response!

Yes, we can use torchvision.datasets.ImageFolder, but only for the "train" split, which has the required structure (images with the same semantic label grouped in the same directory):

1. tiny-imagenet-200\train\n01443537\images
...
200. tiny-imagenet-200\train\n12267677\images

Problem: ImageFolder would assign a single class to all the images in the val and test sets, because their images are grouped in a single "images" folder regardless of their semantic label:

- tiny-imagenet-200\val\images 
- tiny-imagenet-200\test\images

So one has to parse val_annotations.txt to retrieve the labels of the val images. Note that there are annotation files for each category of the train set as well:

1. tiny-imagenet-200\train\n01443537\n01443537_boxes.txt
...
200. tiny-imagenet-200\train\n12267677\n12267677_boxes.txt

Note 2: the test split doesn't have an annotations file, so it's an unsupervised set.

In my implementation, I parse all annotation files (train and val), which fixes the previous problem. It can also (if wanted) provide the bbox of the object of interest for each image.
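The val-annotation parsing described above can be sketched like this. It assumes each line of val_annotations.txt holds the image filename, its wnid, and the four bbox coordinates, whitespace-separated; `parse_val_annotations` is a hypothetical helper for illustration:

```python
def parse_val_annotations(lines):
    """Map image filename -> (wnid, (x0, y0, x1, y1)).

    `lines` is an iterable of annotation lines, e.g. the open file itself.
    """
    labels = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        name, wnid = fields[0], fields[1]
        bbox = tuple(int(v) for v in fields[2:6])
        labels[name] = (wnid, bbox)
    return labels

# Example with a single annotation line (toy values):
sample = ["val_0.JPEG\tn03444034\t0\t32\t44\t62"]
print(parse_val_annotations(sample))
```

The same shape of parser works for the per-class `*_boxes.txt` files of the train split, minus the wnid column.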

I also think that having numpy files leads to faster loading, as it bypasses the file-structure parsing and the data can fit directly in memory. Combined, all the numpy files (images, targets, bboxes) of the 'val' and 'train' splits weigh 1.26 GB.

There are 120,203 files that can be replaced by these 6 numpy files. This can be advantageous where there is a disk quota, especially on the number of inodes.
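The caching idea above can be sketched as follows: decode the images once, serialize them to .npy, and reload everything in a couple of file reads instead of one read per image. The array sizes here are toy values, not the real splits:

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp:
    # Stand-ins for the decoded train split (real shape would be
    # (100000, 64, 64, 3) for the images).
    data = np.zeros((10, 64, 64, 3), dtype=np.uint8)
    targets = np.arange(10, dtype=np.int64)

    np.save(os.path.join(tmp, "train_data.npy"), data)
    np.save(os.path.join(tmp, "train_targets.npy"), targets)

    # Reloading touches two files (two inodes) instead of one per image.
    loaded = np.load(os.path.join(tmp, "train_data.npy"))
    print(loaded.shape)  # β†’ (10, 64, 64, 3)
```

For arrays too large for RAM, `np.load(..., mmap_mode="r")` can memory-map the file instead of reading it eagerly.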

datumbox commented 2 years ago

@towzeur I agree that adding Tiny ImageNet is worth it. We need to see whether this can be done on the current dataset API or on the new one. If it's the latter, it might take a bit more time, so I would advise parsing the val annotations as you described to make the dataset usable with ImageFolder. Concerning your point about the lack of annotations on the test images, I believe that's something that was considered in the new datasets API, which is under development.

cc @NicolasHug @pmeier

lfolkerts commented 1 year ago

Here is a quick script to reorganize the extracted archive into the per-class layout that ImageFolder expects:

import os
import re

def create_dir(base_path, classname):
    # Create the per-class directory if it does not exist yet.
    path = os.path.join(base_path, classname)
    if not os.path.exists(path):
        os.mkdir(path)

def reorg(filename, base_path, wordmap):
    # Move each validation image into the directory named after its class,
    # using the image -> wnid mapping in val_annotations.txt.
    with open(filename) as annotations:
        for line in annotations:
            fields = line.split()
            imagename, wnid = fields[0], fields[1]
            classname = wordmap[wnid]
            src = os.path.join(base_path, imagename)
            if os.path.exists(src):
                os.rename(src, os.path.join(base_path, classname, imagename))

wordmap = {}
with open('words.txt') as words, open('wnids.txt') as wnids:
    # Collect the 200 wnids actually used by Tiny ImageNet.
    for line in wnids:
        wordmap[line.split()[0]] = ""
    # Map each wnid to a short human-readable class name built from the
    # first one or two words of its description.
    for line in words:
        fields = line.split()
        if fields[0] in wordmap:
            single_words = fields[1:]
            classname = re.sub(",", "", single_words[0])
            if len(single_words) >= 2:
                classname += '_' + re.sub(",", "", single_words[1])
            wordmap[fields[0]] = classname
            create_dir('./val/images/', classname)
            # Rename the train class directories from wnids to readable names.
            if os.path.exists('./train/' + fields[0]):
                os.rename('./train/' + fields[0], './train/' + classname)

reorg('val/val_annotations.txt', 'val/images/', wordmap)