zhantaochen / neurorient

MIT License

Model creation better be independent of external configs? #10

Closed zhantaochen closed 1 year ago

zhantaochen commented 1 year ago

Just a general concern: currently, models like ResNet2RotMat cannot be created without the _CONFIG. I still feel it is better to allow people to create them by simply passing args and kwargs.

I am going to inject _CONFIG through args and kwargs; feel free to let me know what you think, @carbonscott.

class ResNet2RotMat(nn.Module):
    def __init__(self, size=50, pretrained=False):
        super().__init__()
        resnet_type = f"resnet{size}"
        self.backbone = ImageEncoder(resnet_type, pretrained)

        # Create the adapter layer between backbone and bifpn...
        self.backbone_to_bifpn = nn.ModuleList([
            DepthwiseSeparableConv2d(in_channels  = in_channels,
                                     out_channels = _CONFIG.BIFPN.NUM_FEATURES,
                                     kernel_size  = 1,
                                     stride       = 1,
                                     padding      = 0)
            for _, in_channels in _CONFIG.BACKBONE.OUTPUT_CHANNELS.items()
        ])

        self.bifpn = BiFPN(num_blocks   = _CONFIG.BIFPN.NUM_BLOCKS,
                           num_features = _CONFIG.BIFPN.NUM_FEATURES,
                           num_levels   = _CONFIG.BIFPN.NUM_LEVELS)

        self.regressor_head = nn.Linear(_CONFIG.REGRESSOR_HEAD.IN_FEATURES, _CONFIG.REGRESSOR_HEAD.OUT_FEATURES)
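A minimal sketch of the decoupled constructor this comment proposes (plain Python with the torch layers omitted; all parameter names and defaults here are assumptions, not the repository's actual interface): every value the current implementation reads from the global _CONFIG becomes an explicit keyword argument, and a thin classmethod keeps config-driven construction available for training scripts.

```python
class ResNet2RotMatSketch:
    """Hypothetical config-free constructor: each value that the current
    implementation pulls from the global _CONFIG becomes a kwarg."""

    def __init__(self, size=50, pretrained=False,
                 bifpn_num_features=64, bifpn_num_blocks=1, bifpn_num_levels=5,
                 regressor_in_features=64, regressor_out_features=6):
        self.size = size
        self.pretrained = pretrained
        self.bifpn_num_features = bifpn_num_features
        self.bifpn_num_blocks = bifpn_num_blocks
        self.bifpn_num_levels = bifpn_num_levels
        self.regressor_in_features = regressor_in_features
        self.regressor_out_features = regressor_out_features

    @classmethod
    def from_config(cls, config):
        # Adapter so a training script can still build the model from a
        # config dict, without the model depending on any global state.
        return cls(**config)
```

With this shape, `ResNet2RotMatSketch(size=18, bifpn_num_blocks=2)` works with no config object in scope, while `from_config` preserves the config-driven path.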
zhantaochen commented 1 year ago

I am planning to make some big changes to how the models depend on configuration files. The preliminary plan is to make the train_xxx.py scripts rely heavily on configurations, while the rest of the code remains flexible and decoupled from configurations. Please let me know what you think :)
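The split described above could look like the following sketch (JSON is used here purely for illustration; the file layout and the `"model"` key are assumptions): only the training entry point touches the configuration file, and the model class receives plain keyword arguments.

```python
import json

def build_model(model_cls, config_path):
    # Only this entry-point helper reads the configuration file;
    # the model class itself just receives plain keyword arguments.
    with open(config_path) as f:
        cfg = json.load(f)
    return model_cls(**cfg.get("model", {}))
```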

carbonscott commented 1 year ago

@zhantaochen Agreed. Your idea of managing the training through configuration while using arguments to set up the model itself presents a clearer approach.

In retrospect, I suppose one advantage of using _CONFIG in the model configuration is that it minimizes the need for significant interface changes. For example, I can keep the interface as size=50 and pretrained=False.

Overall, let's continue iterating to find a balance between argument passing and using a configurator.

carbonscott commented 1 year ago

Another retrospective thought: I had considered limiting users' exposure to BiFPN once an optimal set of hyperparameters for the BiFPN layer had been determined. In this regard, a more refined approach might be to introduce an explicit flag indicating whether BiFPN is employed, replacing the current convention of checking the number of blocks used by the BiFPN layer (0 means BiFPN is not in use).
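A sketch of that flag idea (the class names below are stand-ins for illustration, not the repository's actual API): an explicit `use_bifpn` argument replaces the implicit "num_blocks == 0 means disabled" convention.

```python
class BiFPNStub:
    # Stand-in for the real BiFPN layer, for illustration only.
    def __init__(self, num_blocks):
        self.num_blocks = num_blocks

    def __call__(self, x):
        return x  # identity here; the real layer would fuse feature maps

class ModelSketch:
    def __init__(self, use_bifpn=True, bifpn_num_blocks=1):
        # Explicit flag: readers no longer need to know that
        # bifpn_num_blocks == 0 used to mean "BiFPN disabled".
        self.bifpn = BiFPNStub(bifpn_num_blocks) if use_bifpn else None

    def forward(self, x):
        return self.bifpn(x) if self.bifpn is not None else x
```

This keeps the hyperparameters hidden behind a default while making the on/off decision readable at the call site, e.g. `ModelSketch(use_bifpn=False)`.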