Open alanlukezic opened 3 years ago
That's interesting: I'm experimenting with the same thing (RGB + 1 channel with additional information). No loss decrease at all on my side.
Do you have a code sample for this? I would like to try this as well.
def build_backbone(args):
    position_embedding = build_position_encoding(args)
    train_backbone = args.lr_backbone > 0
    return_interm_layers = args.masks
    backbone = Backbone(args.backbone, train_backbone, return_interm_layers, args.dilation)
    for name, module in backbone.named_modules():
        if name == "body":
            module.conv1 = nn.Conv2d(input_image_channels, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    model = Joiner(backbone, position_embedding)
    model.num_channels = backbone.num_channels
    return model
@olivierp9 the code block above should work when added to backbone.py. Just replace 'input_image_channels' with the number of channels in your dataset images.
First, thanks for sharing this great work! I want to use DETR for object detection on images with 4 input channels (RGB + 1 channel with additional information). I modified the ResNet backbone so that the first conv layer (conv1) takes 4 input channels instead of 3, and copied the weight values for the first 3 input channels from the original conv1. I tried switching gradient propagation on and off in the first two backbone layers (which are originally not trained); the loss decreases a bit after a few epochs and then stays high. I also verified that the fourth channel is normalized and in the same range as the RGB channels. Any idea why the loss does not decrease as expected?
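For reference, the conv1 modification described above could be sketched roughly as follows. This is only an illustration, not the exact code used in this issue: `expand_conv_in_channels` is a hypothetical helper name, the standalone `conv1` is a stand-in for the pretrained ResNet layer, and initializing the extra channel with the mean of the pretrained RGB filters is an assumption (the post does not say how the fourth channel's weights were initialized).

```python
import torch
import torch.nn as nn


def expand_conv_in_channels(conv: nn.Conv2d, in_channels: int) -> nn.Conv2d:
    """Hypothetical helper: copy `conv` into a layer accepting `in_channels` inputs."""
    if conv.groups != 1:
        raise ValueError("this sketch only handles groups == 1")
    if in_channels < conv.in_channels:
        raise ValueError("in_channels must be >= the original number of input channels")

    new_conv = nn.Conv2d(
        in_channels,
        conv.out_channels,
        kernel_size=conv.kernel_size,
        stride=conv.stride,
        padding=conv.padding,
        dilation=conv.dilation,
        bias=conv.bias is not None,
    )
    with torch.no_grad():
        old = conv.in_channels
        # Keep the existing filters for the original (RGB) channels.
        new_conv.weight[:, :old] = conv.weight
        # Assumption: start every extra channel from the mean of the existing filters.
        new_conv.weight[:, old:] = conv.weight.mean(dim=1, keepdim=True)
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias)
    return new_conv


# Stand-in with the same shape as ResNet's conv1 (randomly initialized here).
conv1 = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)
conv1_4ch = expand_conv_in_channels(conv1, 4)

# A batch of two 4-channel images passes through the expanded layer.
out = conv1_4ch(torch.randn(2, 4, 64, 64))
```

In backbone.py, the result would be assigned the same way as in the snippet earlier in this thread, i.e. to `module.conv1` on the "body" module.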