airctic / icevision

An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come
https://airctic.github.io/icevision/
Apache License 2.0

Models and their configurations #60

Closed oke-aditya closed 4 years ago

oke-aditya commented 4 years ago

🚀 Feature

Using torch lightning.

lgvaz commented 4 years ago

Following the excellent lightning architecture, something to keep in mind is to avoid hiding pytorch details from the user; this is the single most loved feature of lightning.

That being said, I do think we can do something similar to what you're suggesting, and it would be great if it worked for both FasterRCNN and MaskRCNN.

Currently the creation of the backbone model happens inside the constructor:

def __init__(self, ...):
    # fasterrcnn_resnet50_fpn comes from torchvision.models.detection
    self.m = fasterrcnn_resnet50_fpn(pretrained=self.pretrained, **kwargs)

What do you think about injecting this via an argument? Something like:

def __init__(self, backbone=None):
    self.m = backbone or fasterrcnn_resnet50_fpn(pretrained=self.pretrained)
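Under this sketch the injected object replaces the whole torchvision model, so a user could pass their own, e.g. (a hypothetical usage sketch; MantisFasterRCNN is a stand-in name for the wrapper class):

from torchvision.models.detection import fasterrcnn_resnet50_fpn

# plain pytorch, written by the user; no magic inside our classes
my_model = fasterrcnn_resnet50_fpn(pretrained=False, num_classes=3)
wrapper = MantisFasterRCNN(backbone=my_model)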

I like this way because we avoid the problem of unsupported backbones, and we don't hide pytorch from the user.

lgvaz commented 4 years ago

We also need to be careful with model_splits, which is used for differential learning rates. Independent of the method we decide to go with, we somehow have to make sure model_splits keeps behaving as expected.
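As a rough illustration (a sketch only, assuming torchvision's FasterRCNN module layout; the real helper may differ), model_splits yields parameter groups that the trainer can assign different learning rates to:

def model_splits(model):
    # e.g. a lower LR for the pretrained backbone than for the detection heads
    return [model.backbone.parameters(), model.rpn.parameters(), model.roi_heads.parameters()]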

oke-aditya commented 4 years ago

Slight issue in the backbone definition: to create the backbone we need to pass it to PyTorch's FasterRCNN class, and then the model gets created, similar to the ResNet-50 FPN one.

Also, when we add a backbone, the major problem is that the default ResNet-50 FPN model is trained on COCO, while the backbones we add are trained on ImageNet.
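For context, torchvision itself can assemble a detector from an ImageNet-pretrained backbone; only the full fasterrcnn_resnet50_fpn checkpoint carries COCO weights. A sketch using torchvision's own utilities:

from torchvision.models.detection import FasterRCNN
from torchvision.models.detection.backbone_utils import resnet_fpn_backbone

backbone = resnet_fpn_backbone("resnet34", pretrained=True)  # ImageNet weights only
model = FasterRCNN(backbone, num_classes=91)  # the detection head starts untrained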

lgvaz commented 4 years ago

To create the backbone we need to pass it to PyTorch's FasterRCNN class

Yeah, that's okay, we can easily refactor the code to work like that

About the pre-trained weights: in some cases we might even want to start from scratch, so it's fine to start from ImageNet, as long as it's clear to the user what is happening.

What's important here is that if the user wants to change the backbone, we want to make sure they have to write pytorch code; no magic should happen inside our classes.

What we should do is write helper functions/classes so that they have to write less boilerplate, that's all.
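Such a helper might look like this (a minimal sketch, not the library's actual API; faster_rcnn_from_backbone is a made-up name):

import torchvision
from torchvision.models.detection import FasterRCNN

def faster_rcnn_from_backbone(backbone, out_channels, num_classes):
    # removes boilerplate only: FasterRCNN requires the backbone to expose out_channels
    backbone.out_channels = out_channels
    return FasterRCNN(backbone, num_classes=num_classes)

# the backbone itself stays plain pytorch, written by the user
mobilenet = torchvision.models.mobilenet_v2(pretrained=True).features
model = faster_rcnn_from_backbone(mobilenet, out_channels=1280, num_classes=3)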

oke-aditya commented 4 years ago

That sounds good. It would make it easier.

lgvaz commented 4 years ago

Feel free to open a PR implementing these changes, I'll give you freedom to design the API as you like, just follow the principles discussed here 🎉

Oh, we don't have a contributing guide yet, but we take inspiration from the lightning contribution guide, which you can find here.

For code formatting, use black, all defaults, simple

That's it, happy coding 😉

lgvaz commented 4 years ago

Can you provide a high-level example (with code, just like I did before) of how that would look?

oke-aditya commented 4 years ago

I'm thinking a bit about backbones: create a separate folder for backbones and keep inheriting those models. A few reasons why:

  1. Torchvision lacks backbones such as EfficientNet, Inception-ResNet, etc. This would again limit us (later refactoring is very hard).
  2. It will give more granular control and force the user to edit the backbone PyTorch models in our folder, rather than going to torchvision or an outside library. Heavily drawn from https://www.github.com/qfgaohao/pytorch-ssd/tree/master/vision. We could have more backbones, and it will allow users to extend them easily.
  3. In many cases we need layers that aren't in torch. How do we handle that? With this we can have a separate layers folder. Only these two would be in torch; all the rest in Lightning.

At a high level, the folder structure would look like this:

mantishrimp
-> backbones (containing backbones like mobilenet etc) 
-> layers (layers that aren't in torch, e.g. dilated convolution; used by both backbones and models)
-> models (which contain folder for each e.g. ssd, fasterrcnn, detr)

This will allow the user to specify the backbone, from which the full model is then flexibly built.

def __init__(self, backbone=None):
    self.m = backbone or fasterrcnn_resnet50_fpn(pretrained=self.pretrained)

will change a bit to

def __init__(self, user_backbone=None):
    ft_backbone = user_backbone.features
    ft_backbone.out_channels = ...  # set to the backbone's output channels, as in the model
    self.m = FasterRCNN(backbone=ft_backbone)

This FasterRCNN (and similar models) should come from the models folder.

So it simplifies as follows: backbones, layers, and models are in PyTorch (there is no other alternative), and then our trainer, engines, and user access will be with PyTorch Lightning.
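Consumed from the proposed layout, that might look like this (a hypothetical sketch; the module paths simply mirror the folder tree above, and mobilenet / MantisFasterRCNN are stand-in names):

from mantishrimp.backbones import mobilenet
from mantishrimp.models.faster_rcnn import MantisFasterRCNN

model = MantisFasterRCNN(user_backbone=mobilenet(pretrained=True))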

It will allow us to create a separate file, like I did: model.py for initializing the model with user-defined backbones and hyperparameters, which would be passed to engine.py, which is controlled by pytorch lightning.

The end user would have less pain manually defining everything.
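To make the split concrete, a minimal engine.py sketch (the class name DetectionEngine and the optimizer choice are assumptions; only the lightning hooks are real API):

import torch
import pytorch_lightning as pl

class DetectionEngine(pl.LightningModule):
    def __init__(self, model, lr=1e-3):
        super().__init__()
        self.model = model
        self.lr = lr

    def training_step(self, batch, batch_idx):
        images, targets = batch
        loss_dict = self.model(images, targets)  # torchvision detectors return a dict of losses in train mode
        return sum(loss_dict.values())

    def configure_optimizers(self):
        return torch.optim.SGD(self.model.parameters(), lr=self.lr)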

oke-aditya commented 4 years ago

Here, instead of taking the feature extractor from torchvision, it would come from our custom backbones folder; those would be written with pytorch lightning.

Extend this to object detection.
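A custom backbone module of that kind might look like this (a sketch; MobileNetBackbone is a made-up name, and it exposes .features and .out_channels as the constructor sketch above expects):

import torch.nn as nn
import torchvision

class MobileNetBackbone(nn.Module):
    # a hypothetical entry in the backbones folder
    def __init__(self, pretrained=True):
        super().__init__()
        self.features = torchvision.models.mobilenet_v2(pretrained=pretrained).features
        self.out_channels = 1280  # channels of mobilenet_v2's final feature map

    def forward(self, x):
        return self.features(x)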

oke-aditya commented 4 years ago

Closing this for now, as it will become clearer in #61.