longcw / yolo2-pytorch

YOLOv2 in PyTorch

"torch.utils.data.dataloader" instead of "multiprocessing.Pool.imap" #52

Open AmIr-KinG opened 6 years ago

AmIr-KinG commented 6 years ago

Hi, has anyone tried writing a data loader using torch.utils.data? I'm wondering what I would need to change in order to do so.

ankwinters commented 6 years ago

According to the Dataset source at http://pytorch.org/docs/master/_modules/torch/utils/data/dataset.html#Dataset you need to implement two methods, __getitem__ and __len__:

class Dataset(object):
    """An abstract class representing a Dataset.

    All other datasets should subclass it. All subclasses should override
    ``__len__``, that provides the size of the dataset, and ``__getitem__``,
    supporting integer indexing in range from 0 to len(self) exclusive.
    """
    def __getitem__(self, index):
        raise NotImplementedError

    def __len__(self):
        raise NotImplementedError

    def __add__(self, other):
        return ConcatDataset([self, other])
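
To make that contract concrete, here is a minimal sketch of a subclass wrapped in a DataLoader. SquaresDataset is a made-up toy class used only for illustration (it is not part of this repository), and the batch size is an arbitrary choice.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: item i is the pair (i, i * i)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Size of the dataset; the DataLoader uses it to build index batches
        return self.n

    def __getitem__(self, index):
        # Integer indexing in the range 0 .. len(self) - 1
        x = torch.tensor(float(index))
        return x, x * x


dataset = SquaresDataset(10)
# The default collate function stacks the per-item tensors into batch tensors
loader = DataLoader(dataset, batch_size=4, shuffle=False)
batches = [(x, y) for x, y in loader]
```

With 10 items and a batch size of 4, the last batch is smaller than the others unless drop_last=True is passed.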

Here is a sample of what I have done before:

import glob
import os

import torch.utils.data


class Dataset(torch.utils.data.Dataset):
    def __init__(self, img_dir, label_dir):
        # Collect the image paths and derive the matching label path for each
        self.image_files = sorted(glob.glob(os.path.join(img_dir, '*.bmp')))
        self.label_files = [
            os.path.join(label_dir, os.path.splitext(os.path.basename(f))[0] + '.bmp')
            for f in self.image_files
        ]
        # Do something

    def __len__(self):
        return len(self.image_files)

    def __getitem__(self, idx):
        # Do something: load and preprocess the files at these paths
        image = self.image_files[idx]
        label = self.label_files[idx]
        return image, label
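
To actually replace the multiprocessing.Pool.imap loading, a dataset like this would be passed to torch.utils.data.DataLoader, which can load batches in worker processes via num_workers. The sketch below is only an illustration under assumptions: ToyDetectionDataset and detection_collate are hypothetical names, the images are dummy tensors, and the box format is made up. It uses a custom collate_fn because in detection the number of boxes usually differs per image, and the default collate function cannot stack tensors of different shapes.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class ToyDetectionDataset(Dataset):
    """Hypothetical stand-in: dummy images with a varying number of boxes."""

    def __init__(self, num_items=6):
        self.num_items = num_items

    def __len__(self):
        return self.num_items

    def __getitem__(self, idx):
        image = torch.zeros(3, 32, 32)
        # idx % 3 + 1 boxes per image, each row assumed to be (x1, y1, x2, y2)
        boxes = torch.zeros(idx % 3 + 1, 4)
        return image, boxes


def detection_collate(batch):
    """Stack images into one tensor; keep boxes as a list of tensors."""
    images = torch.stack([item[0] for item in batch], dim=0)
    boxes = [item[1] for item in batch]
    return images, boxes


loader = DataLoader(
    ToyDetectionDataset(),
    batch_size=4,
    shuffle=False,
    num_workers=0,  # 0 loads in the main process; raise it to use worker processes
    collate_fn=detection_collate,
)
images, boxes = next(iter(loader))
```

When num_workers is greater than 0, it is safer to create and iterate the loader inside an `if __name__ == "__main__":` guard, since some platforms start workers by re-importing the main module.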