The paper on SegFormer suggests an All MLP decoder.
The SegformerHead.py shows the use of a Conv2D for the final layer.
Can you help me understand if this is a deviation from the paper or mentioned in a followup paper somewhere? I apologize in advance if there is an obvious answer.
1x1 conv is basically the same as linear operation. In segmentation, 1x1 conv is usually used as the last layer. You can also check the official segformer implementation.
The paper on SegFormer suggests an All MLP decoder.
The SegformerHead.py shows the use of a Conv2D for the final layer.
Can you help me understand if this is a deviation from the paper or mentioned in a followup paper somewhere? I apologize in advance if there is an obvious answer.