Closed by gagewrye 1 month ago
The head returns a tuple (outputs, loss). The loss is None if targets are not provided. targets is the second argument to the model.forward(images, targets) call, and its structure is task-specific: for segmentation, targets should be a long torch.Tensor of shape BxHxW where each value is the class ID. If you want to customize how the loss is computed, it is best to use a custom head; the head is not restored from the checkpoint anyway.
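To make the target format concrete, here is a minimal sketch in plain PyTorch. It does not call the library's model: the logits tensor is a random stand-in for the head's per-pixel output, the sizes are made up, and cross-entropy is an assumption about the loss, not a statement of what the built-in head computes.

```python
import torch
import torch.nn.functional as F

B, C, H, W = 2, 3, 8, 8  # hypothetical batch size, class count, image size

# Stand-in for the head's per-pixel class logits, shaped B x C x H x W.
logits = torch.randn(B, C, H, W, requires_grad=True)

# Targets: a long tensor shaped B x H x W holding one class ID per pixel.
targets = torch.randint(0, C, (B, H, W), dtype=torch.long)

# Cross-entropy is assumed here; the library's head may use something else.
loss = F.cross_entropy(logits, targets)
loss.backward()
```

Note that the targets carry no channel dimension and are not one-hot encoded.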
Thank you for the fast reply!! I am trying to use a custom loss function, so I will use a custom head. Do you have any recommendations for a binary segmentation task? This is being used as part of an engineering effort at UCSD to monitor mangroves.
I believe passing fpn=True enables both the Feature Pyramid Network and a UNet-style decoder, which passes over the FPN output features from lowest to highest resolution while upsampling. The decoder's output feature map should be the first one in the list of feature maps returned by the model when fpn=True.
So I would simply use that feature map, and add a linear output layer to compute the logits.
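A rough sketch of such a head in plain PyTorch, for the binary case. Everything here is an assumption for illustration: the class name BinarySegHead, the channel count of 64, the way the head receives the list of feature maps, and the choice of binary cross-entropy as the loss. The 1x1 convolution plays the role of the linear output layer, applied independently at every pixel.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarySegHead(nn.Module):
    """Hypothetical head: a 1x1 conv (per-pixel linear layer) on one feature map."""

    def __init__(self, in_channels: int):
        super().__init__()
        self.classifier = nn.Conv2d(in_channels, 1, kernel_size=1)

    def forward(self, features, targets=None):
        # Assumption: features is a list of B x C x h x w maps, with the
        # decoder's output (highest resolution) first.
        x = features[0]
        logits = self.classifier(x)  # B x 1 x h x w
        loss = None
        if targets is not None:
            # Upsample the logits to the mask resolution before the loss.
            logits = F.interpolate(
                logits, size=targets.shape[-2:], mode="bilinear", align_corners=False
            )
            # Binary targets: B x H x W with values 0 or 1.
            loss = F.binary_cross_entropy_with_logits(
                logits.squeeze(1), targets.float()
            )
        return logits, loss


# Usage with random stand-in features and masks.
head = BinarySegHead(in_channels=64)
feats = [torch.randn(2, 64, 16, 16), torch.randn(2, 64, 8, 8)]
masks = torch.randint(0, 2, (2, 64, 64))
logits, loss = head(feats, masks)
loss.backward()
```

Returning (logits, loss) mirrors the tuple convention described earlier in the thread, and the loss line is the place to swap in a custom loss function.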
Other suggestions:
The segmentation classifier outputs a tuple of the actual output tensor and None. When I unpack the tuple in the model to get the output, it interrupts the gradient flow and prevents the model from training. This seems to be an issue with the head itself: the model trains when I replace it with a custom head.
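One way to narrow down where the gradient is lost is to check whether the unpacked output is still attached to the autograd graph, and which parameters receive gradients after a backward pass. The sketch below uses a stand-in head (TupleHead is hypothetical, not the library's segmentation classifier) that returns (outputs, None) in the same way. In plain PyTorch, unpacking a tuple does not detach a tensor, so running the same checks against the real head may show whether the break happens inside it.

```python
import torch
import torch.nn as nn


class TupleHead(nn.Module):
    """Stand-in head that returns (outputs, None), as when no targets are given."""

    def __init__(self):
        super().__init__()
        self.proj = nn.Conv2d(4, 1, kernel_size=1)

    def forward(self, x):
        return self.proj(x), None


head = TupleHead()
outputs, loss = head(torch.randn(2, 4, 8, 8))

# Check that the unpacked tensor is still part of the autograd graph.
print(outputs.requires_grad, outputs.grad_fn is not None)

# Compute an external loss, then list parameters that received no gradient.
external_loss = outputs.mean()
external_loss.backward()
missing = [name for name, p in head.named_parameters() if p.grad is None]
print("parameters without gradients:", missing)
```

If outputs.grad_fn is None on the real head, or the list of parameters without gradients is not empty, that would point to something inside the head rather than to the unpacking step.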