facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0

Feature Request: Make Data Preprocess Configurable #697

Closed xmyqsh closed 4 years ago

xmyqsh commented 4 years ago

🚀 Feature

Make data preprocessing configurable, including data format conversion, data augmentation, and so on.

Motivation

The training pipeline, data loader, and distributed handling are clean and easy to use, except for the dataset mapper and build_transform_gen. It would be best to make this part configurable so that more tasks are easy to support.

Specifically: first, decouple the dataset mapper from the transforms; second, make the transforms configurable (a rough sketch follows).
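A rough sketch of the decoupling described above, assuming the transform list is built outside the mapper and injected; ConfigurableDatasetMapper below is hypothetical, not detectron2's actual DatasetMapper:

import copy

import torch
from detectron2.data import detection_utils as utils
from detectron2.data import transforms as T


class ConfigurableDatasetMapper:
    # Hypothetical mapper: the transform gens are injected instead of hard-coded.
    def __init__(self, tfm_gens, image_format="BGR"):
        self.tfm_gens = tfm_gens
        self.image_format = image_format

    def __call__(self, dataset_dict):
        dataset_dict = copy.deepcopy(dataset_dict)
        image = utils.read_image(dataset_dict["file_name"], format=self.image_format)
        image, transforms = T.apply_transform_gens(self.tfm_gens, image)
        dataset_dict["image"] = torch.as_tensor(image.transpose(2, 0, 1).astype("float32"))
        return dataset_dict


# The transform list can come from a config file, a registry, or plain user code:
mapper = ConfigurableDatasetMapper([T.ResizeShortestEdge((640, 800), max_size=1333), T.RandomFlip()])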

Sanster commented 4 years ago

For my use case, it would be very useful to make the dataset mapper configurable.

iskode commented 4 years ago

Any plans on this? I'm interested in making the image loading configurable. In my case I'm working with medical images, so neither cv2 nor PIL is useful. My idea is to pass a method for loading the data from the data_dict info; the only specification is on the returned array's type and size, e.g. [W, H, 3] uint8. Can you validate the idea so that I can proceed further?
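For illustration, any callable with that contract would do; everything in the sketch below (the raw int16 layout and the scaling) is a hypothetical placeholder, not a real medical-image reader:

import numpy as np


def my_medical_reader(file_name):
    # Hypothetical loader: read with whatever library fits the modality,
    # then convert to a 3-channel uint8 array as the mapper expects.
    raw = np.fromfile(file_name, dtype=np.int16).reshape(512, 512)  # placeholder read
    scaled = ((raw - raw.min()) / max(raw.ptp(), 1) * 255).astype(np.uint8)
    return np.stack([scaled] * 3, axis=-1)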

ppwwyyxx commented 4 years ago

My idea is to pass a method for loading the data from the data_dict info

You can already do so: https://detectron2.readthedocs.io/tutorials/data_loading.html#write-a-custom-dataloader Let me know if this is not what you want.
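For reference, the pattern in that tutorial is to pass a custom mapper function to build_detection_train_loader; a condensed sketch is below (cfg is assumed to be your existing training config, and the read_image call is where a custom loader would go):

import copy

import torch
from detectron2.data import build_detection_train_loader
from detectron2.data import detection_utils as utils
from detectron2.data import transforms as T


def mapper(dataset_dict):
    dataset_dict = copy.deepcopy(dataset_dict)
    # Swap this call for any loader that returns an HxWxC array.
    image = utils.read_image(dataset_dict["file_name"], format="BGR")
    image, transforms = T.apply_transform_gens([T.Resize((800, 800))], image)
    dataset_dict["image"] = torch.as_tensor(image.transpose(2, 0, 1).astype("float32"))
    annos = [
        utils.transform_instance_annotations(obj, transforms, image.shape[:2])
        for obj in dataset_dict.pop("annotations")
    ]
    instances = utils.annotations_to_instances(annos, image.shape[:2])
    dataset_dict["instances"] = utils.filter_empty_instances(instances)
    return dataset_dict


data_loader = build_detection_train_loader(cfg, mapper=mapper)  # cfg built elsewhere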

xmyqsh commented 4 years ago

In a word, making this line configurable would be enough:

image, transforms = T.apply_transform_gens([T.Resize((800, 800))], image)
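For illustration, one hypothetical way to drive that line from a config; the cfg key INPUT.TRAIN_RESIZE and the helper below are made up, not existing detectron2 options:

from detectron2.data import transforms as T


def build_resize_from_cfg(cfg):
    # INPUT.TRAIN_RESIZE is a hypothetical key, e.g. [800, 800] in a yaml config.
    h, w = cfg.INPUT.TRAIN_RESIZE
    return [T.Resize((h, w))]


# Inside the mapper, the hard-coded list is then replaced by the cfg-driven one:
# image, transforms = T.apply_transform_gens(build_resize_from_cfg(cfg), image)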

But personally I'd prefer to keep it stupid-simple.

Besides having an elegant architecture, detectron2 also aims for easy debugging through a friendly API design.

iskode commented 4 years ago

My idea is to pass a method for loading the data from the data_dict info

You can already do so: https://detectron2.readthedocs.io/tutorials/data_loading.html#write-a-custom-dataloader Let me know if this is not what you want.

Yes, that's what I was looking for. Sorry again for not digging into the docs. In the end my idea seems more complicated, but I'll share it anyway. I wanted to add an optional key (e.g. "specs") to the dataset_dict that holds a dictionary of custom functions for reading the data from file_name and sem_seg_file_name and returning the correct format and type. The mapper would then use them if present, and otherwise fall back to the defaults:

import copy

import cv2
from PIL import Image


class DatasetMapper:

    def custom_reader(self, dataset_dict):
        # Defaults; overridden by the optional "specs" entry if present.
        image_reader = cv2.imread
        label_reader = Image.open
        specs = dataset_dict.get("specs")
        if specs is not None:
            image_reader = specs.get("image_reader", image_reader)
            label_reader = specs.get("label_reader", label_reader)
        return image_reader, label_reader

    def __call__(self, dataset_dict):
        dataset_dict = copy.deepcopy(dataset_dict)
        image_reader, label_reader = self.custom_reader(dataset_dict)
        image = image_reader(dataset_dict["file_name"])
        # ...
        # USER: Remove if you don't do semantic/panoptic segmentation.
        if "sem_seg_file_name" in dataset_dict:
            sem_seg_gt = label_reader(dataset_dict["sem_seg_file_name"])
        # ...

As I write this, I realize the transformations could also be customized easily via the specs dict, for example:
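The same fallback pattern would cover the transforms; the "transform_gens" key below is a hypothetical specs entry, not something detectron2 reads today:

from detectron2.data import transforms as T


def apply_specs_transforms(dataset_dict, image):
    # Hypothetical "transform_gens" entry in specs, with a default fallback.
    default_tfm_gens = [T.Resize((800, 800))]
    specs = dataset_dict.get("specs") or {}
    tfm_gens = specs.get("transform_gens", default_tfm_gens)
    return T.apply_transform_gens(tfm_gens, image)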