SWIN will not train with Head.SEGMENT

gagewrye commented 1 month ago

The segmentation classifier outputs a tuple of the actual output tensor and None. When I separate them in the model to get the output, it interrupts the gradient flow and prevents the model from training. This seems to be an issue with the head itself: the model trains when I replace it with a custom head.

favyen2 commented 1 month ago

The head returns a tuple (outputs, loss). The loss is None if targets are not provided. targets is the second argument to the model.forward(images, targets) call, and its structure is task-specific: for segmentation, the targets should be a long torch.Tensor of shape BxHxW where each value is the class ID. If you want to customize how the loss is computed, it is best to use a custom head; the head is not restored from the checkpoint anyway.
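To make the (outputs, loss) contract concrete, here is a minimal sketch in plain PyTorch. ToySegmentHead is a hypothetical stand-in written for illustration, not the library's actual head, and the channel and image sizes are arbitrary assumptions. It only shows that the loss is None without targets and becomes a differentiable scalar once long BxHxW targets are passed.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ToySegmentHead(nn.Module):
    """Hypothetical stand-in that mimics the (outputs, loss) contract."""

    def __init__(self, in_channels: int, num_categories: int):
        super().__init__()
        # A 1x1 convolution acts as a per-pixel linear classifier.
        self.classifier = nn.Conv2d(in_channels, num_categories, kernel_size=1)

    def forward(self, features, targets=None):
        logits = self.classifier(features)  # B x num_categories x H x W
        loss = None
        if targets is not None:
            # cross_entropy expects long class IDs of shape B x H x W.
            loss = F.cross_entropy(logits, targets)
        return logits, loss


torch.manual_seed(0)
head = ToySegmentHead(in_channels=8, num_categories=2)
features = torch.randn(4, 8, 16, 16)  # assumed feature map, B x C x H x W
targets = torch.randint(0, 2, (4, 16, 16), dtype=torch.long)  # B x H x W class IDs

# Without targets the second element is None, so there is nothing to backpropagate.
outputs, no_target_loss = head(features)

# With targets the loss is a scalar tensor attached to the graph.
outputs, loss = head(features, targets)
loss.backward()
```

If the training loop calls the model with images only and then builds a loss from a detached copy of the outputs, no gradient reaches the weights, which would look like a model that does not train.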

gagewrye commented 1 month ago

Thank you for the fast reply! I am trying to use a custom loss function, so I will use a custom head. Do you have any recommendations for a binary segmentation task? This is being used as part of an engineering effort at UCSD to monitor mangroves.

favyen2 commented 1 month ago

I believe passing fpn=True enables both the Feature Pyramid Network and a UNet-style decoder, which passes over the output features from the FPN from lowest resolution to highest resolution while upsampling. The resulting feature map should be the first one in the list of feature maps returned by the model when fpn=True.

So I would simply use that feature map, and add a linear output layer to compute the logits.
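A rough sketch of such a head for binary segmentation, with a custom loss, might look like the following. Everything here is an assumption for illustration: BinarySegHead and bce_dice_loss are hypothetical names, the 128 input channels and the feature map sizes are made up, and random tensors stand in for the list of feature maps the real backbone would return. The logits are resized to the label resolution in case the first feature map is smaller than the mask.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def bce_dice_loss(logits, targets, eps=1.0):
    """Example custom loss: binary cross-entropy plus soft Dice (one common choice)."""
    bce = F.binary_cross_entropy_with_logits(logits, targets)
    probs = torch.sigmoid(logits)
    intersection = (probs * targets).sum(dim=(1, 2))
    denom = probs.sum(dim=(1, 2)) + targets.sum(dim=(1, 2))
    dice = 1.0 - (2.0 * intersection + eps) / (denom + eps)
    return bce + dice.mean()


class BinarySegHead(nn.Module):
    """Hypothetical custom head: a per-pixel linear layer on the first feature map."""

    def __init__(self, in_channels: int):
        super().__init__()
        # A 1x1 convolution is a linear layer applied independently at each pixel.
        self.out = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, feature_maps, targets=None):
        logits = self.out(feature_maps[0])  # B x 1 x h x w
        if targets is not None and logits.shape[-2:] != targets.shape[-2:]:
            # Resize to the mask resolution if the feature map is smaller.
            logits = F.interpolate(
                logits, size=targets.shape[-2:], mode="bilinear", align_corners=False
            )
        logits = logits.squeeze(1)  # B x H x W
        loss = None
        if targets is not None:
            loss = bce_dice_loss(logits, targets.float())
        return logits, loss


torch.manual_seed(0)
# Stand-in for the backbone output: a list of feature maps, highest resolution first.
feature_maps = [torch.randn(2, 128, 32, 32), torch.randn(2, 128, 16, 16)]
masks = torch.randint(0, 2, (2, 64, 64))  # binary labels, B x H x W

head = BinarySegHead(in_channels=128)
logits, loss = head(feature_maps, masks)
loss.backward()
```

Keeping the loss inside the head, computed from the same tensor that is returned, keeps the graph intact so gradients reach both the head and whatever backbone produced the feature maps.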

Other suggestions:

Freezing the backbone for some iterations sometimes improves performance but usually doesn't change much.
I would try a few other pretrained models that might be available for the image type you are using.
But overall in most applications we've found that it's most important to focus on the annotations. Improving the quality of the annotations while also scaling up the annotation efforts is often the cheapest way to get better performance. You can annotate a lot of data in just five to ten hours.
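For the backbone-freezing suggestion, a minimal sketch is to turn off requires_grad on the backbone parameters for the first few iterations and turn it back on afterwards. The two small convolutions below are toy stand-ins for the real backbone and head, and freeze_iters, the learning rate, and the tensor sizes are arbitrary assumptions.

```python
import torch
import torch.nn as nn


def set_backbone_trainable(backbone: nn.Module, trainable: bool) -> None:
    """Enable or disable gradient computation for every backbone parameter."""
    for param in backbone.parameters():
        param.requires_grad = trainable


torch.manual_seed(0)
backbone = nn.Conv2d(3, 8, kernel_size=3, padding=1)  # toy stand-in for the pretrained backbone
head = nn.Conv2d(8, 1, kernel_size=1)                 # toy stand-in for the custom head
optimizer = torch.optim.AdamW(
    list(backbone.parameters()) + list(head.parameters()), lr=1e-3
)
loss_fn = nn.BCEWithLogitsLoss()

images = torch.randn(2, 3, 16, 16)
targets = torch.randint(0, 2, (2, 1, 16, 16)).float()

freeze_iters = 3  # assumed number of warm-up iterations with a frozen backbone
frozen_snapshot = backbone.weight.detach().clone()
set_backbone_trainable(backbone, False)

for step in range(6):
    if step == freeze_iters:
        # Record whether the backbone stayed fixed during the frozen phase, then unfreeze.
        backbone_unchanged_while_frozen = torch.equal(backbone.weight, frozen_snapshot)
        set_backbone_trainable(backbone, True)
    optimizer.zero_grad()
    loss = loss_fn(head(backbone(images)), targets)
    loss.backward()
    optimizer.step()
```

Parameters whose gradient is None are skipped by the optimizer, so only the head is updated during the frozen phase.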

allenai / satlaspretrain_models

SWIN will not train with Head.SEGMENT #12