Lightning-Universe / lightning-flash

Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains
https://lightning-flash.readthedocs.io
Apache License 2.0

Customizable data pipeline for object detection #159

Closed reactivetype closed 3 years ago

reactivetype commented 3 years ago

🚀 Feature

I would like to have a flexible interface to customize dataset and data pipeline for object detection

Motivation

Thanks for creating this fantastic library. For research and applications, I want to use datasets other than CustomCOCODataset. There are two possible scenarios:

Is either of these two scenarios possible? Can I swap CustomCOCODataset for my own LightningDataModule? Do we need to customize ObjectDetectionDataPipeline? I am not sure what the task pipeline is for. Some guidance would be appreciated. Thanks.

kaushikb11 commented 3 years ago

Hi, @reactivetype! Yes, that sounds great. Currently, the flow for object detection (OD) looks like this:

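# Assumed import path (Flash 0.2.x); newer releases use flash.image instead.
from flash.vision import ObjectDetectionData, ObjectDetector

datamodule = ObjectDetectionData.from_coco(
    train_folder="data/coco128/images/train2017/",
    train_ann_file="data/coco128/annotations/instances_train2017.json",
    batch_size=5,
)

# The number of classes is taken from the data module built above.
model = ObjectDetector(num_classes=datamodule.num_classes)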

We could add support for more dataset formats by adding class methods to the ObjectDetectionData class, for example ObjectDetectionData.from_yolo(...), ObjectDetectionData.from_voc(...), etc.
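To illustrate the class-method idea, here is a minimal sketch of the alternative-constructor pattern. It does not use Flash: `ToyDetectionData`, `from_coco_records`, and `from_yolo_lines` are hypothetical names made up for this example, and the inputs are in-memory records rather than files. The only facts relied on are the box conventions of the two formats: COCO stores `[x_min, y_min, width, height]` in pixels, while YOLO label lines store `class cx cy w h` normalized to the image size.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class ToyDetectionData:
    """Hypothetical stand-in for a detection data module; not part of Flash."""

    # Each sample is {"label": int, "box": (x_min, y_min, x_max, y_max)} in pixels.
    samples: List[Dict]

    @classmethod
    def from_coco_records(cls, annotations: Sequence[Dict]) -> "ToyDetectionData":
        # COCO boxes are [x_min, y_min, width, height] in pixels.
        samples = []
        for ann in annotations:
            x, y, w, h = ann["bbox"]
            samples.append({"label": ann["category_id"], "box": (x, y, x + w, y + h)})
        return cls(samples)

    @classmethod
    def from_yolo_lines(
        cls, lines: Sequence[str], image_width: int, image_height: int
    ) -> "ToyDetectionData":
        # YOLO label lines are "class cx cy w h", normalized to [0, 1].
        samples = []
        for line in lines:
            class_id, cx, cy, w, h = line.split()
            cx, w = float(cx) * image_width, float(w) * image_width
            cy, h = float(cy) * image_height, float(h) * image_height
            box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
            samples.append({"label": int(class_id), "box": box})
        return cls(samples)

    @property
    def num_classes(self) -> int:
        # Number of distinct labels seen in the data.
        return len({sample["label"] for sample in self.samples})


coco_data = ToyDetectionData.from_coco_records(
    [{"bbox": [10, 20, 30, 40], "category_id": 1}]
)
yolo_data = ToyDetectionData.from_yolo_lines(["0 0.5 0.5 0.5 0.5"], 100, 200)
```

Because every constructor converts to the same internal box layout, the rest of the code (model, transforms) would not need to know which format the data came from.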

Yes, you could pass transformation functions to the train_transform argument of ObjectDetectionData.from_coco.
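As a rough idea of what such a transformation function could look like, here is a sketch of a horizontal flip that keeps the bounding boxes in sync with the image. The exact signature that train_transform expects is not shown in this thread, so this assumes a callable that takes and returns a sample dict with `"image"` (a list of pixel rows) and `"boxes"` (`[x_min, y_min, x_max, y_max]` in pixels); `horizontal_flip` and `compose` are hypothetical helpers, not Flash functions.

```python
from typing import Callable, Dict


def horizontal_flip(sample: Dict) -> Dict:
    """Flip the image left-to-right and mirror the boxes accordingly."""
    image = sample["image"]  # list of rows, each row a list of pixel values
    width = len(image[0])
    flipped_image = [row[::-1] for row in image]
    # Mirroring swaps the roles of x_min and x_max.
    flipped_boxes = [
        [width - x_max, y_min, width - x_min, y_max]
        for x_min, y_min, x_max, y_max in sample["boxes"]
    ]
    return {**sample, "image": flipped_image, "boxes": flipped_boxes}


def compose(*transforms: Callable[[Dict], Dict]) -> Callable[[Dict], Dict]:
    """Chain several sample transforms into one callable."""

    def apply(sample: Dict) -> Dict:
        for transform in transforms:
            sample = transform(sample)
        return sample

    return apply


sample = {"image": [[1, 2, 3, 4]], "boxes": [[0, 0, 1, 1]], "labels": [7]}
flipped = horizontal_flip(sample)
round_trip = compose(horizontal_flip, horizontal_flip)(sample)
```

The point of the sketch is that detection transforms must update the targets together with the image, which is why a plain image-only transform is usually not enough.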

The purpose of the DataPipeline is to define how data is transformed, using hooks that run at fixed points in the flow. So, depending on your data requirements, you could tweak it by subclassing it and overriding the relevant hooks.
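A minimal sketch of the hook idea follows, assuming nothing about the real DataPipeline API: `ToyPipeline`, `ToyDetectionPipeline`, and the hook names `before_collate`, `collate`, and `after_collate` are hypothetical and chosen only for illustration. The base class fixes the order in which hooks run, and a subclass overrides just the step it needs to change.

```python
from typing import Any, Sequence


class ToyPipeline:
    """Hypothetical hook-based pipeline; not the Flash DataPipeline."""

    def before_collate(self, samples: Sequence[Any]) -> Sequence[Any]:
        return samples

    def collate(self, samples: Sequence[Any]) -> Any:
        return list(samples)

    def after_collate(self, batch: Any) -> Any:
        return batch

    def __call__(self, samples: Sequence[Any]) -> Any:
        # The base class fixes the order; subclasses only override hooks.
        samples = self.before_collate(samples)
        batch = self.collate(samples)
        return self.after_collate(batch)


class ToyDetectionPipeline(ToyPipeline):
    def before_collate(self, samples):
        # Drop samples that have no boxes at all.
        return [(image, target) for image, target in samples if target["boxes"]]

    def collate(self, samples):
        # Images can carry different numbers of boxes, so keep images and
        # targets as parallel tuples instead of stacking them.
        images, targets = zip(*samples)
        return images, targets


batch = ToyDetectionPipeline()(
    [
        ("img0", {"boxes": [[0, 0, 1, 1]]}),
        ("img1", {"boxes": []}),
        ("img2", {"boxes": [[2, 2, 5, 5], [1, 1, 3, 3]]}),
    ]
)
```

Here only two hooks are overridden, and the calling order stays in the base class, which is what makes this kind of design easy to customize per task.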

Note that we are currently refactoring DataPipeline in #141, so its behavior may change, but the result should be a better experience for users! :)

edgarriba commented 3 years ago

@reactivetype The DataPipeline refactor has already been merged. Please check whether it suits your use case. In addition, we are refactoring the data modules to make them more flexible and user-friendly when working with custom data structures. Take a look at #256.

edenlightning commented 3 years ago

Please feel free to reopen if needed!