Open artest08 opened 2 years ago
Additionally, I want to emphasize that, as far as I understand, the res2 features are not used in the deformable segmentation head; only the res3 features are used, and they are projected with stride 2. Since res3 already has stride 8, the finest resolution becomes stride 16, which I think might degrade the results.
Hello,
First of all, thank you for your great work.
I want to ask about the implementation of the input projection before MaskHeadSmallConv in segmentation.py. The implementation applies stride 2 to the features, which makes the finest stride 8. However, for segmentation tasks, better results are possible when stride 4 is used for mask creation. The original DETR segmentation head also works that way. What is the reason for using that stride in your implementation?
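To make the stride arithmetic concrete, here is a minimal sketch in plain Python. It is not code from the repository: the helper names `effective_stride` and `mask_size` are hypothetical, the backbone strides are the nominal ResNet values, and the 800x1200 input size is only an illustrative assumption.

```python
# Nominal ResNet output strides relative to the input image.
BACKBONE_STRIDES = {"res2": 4, "res3": 8, "res4": 16, "res5": 32}


def effective_stride(level, proj_stride):
    """Stride of a backbone level after a projection conv with stride `proj_stride`."""
    return BACKBONE_STRIDES[level] * proj_stride


def mask_size(image_hw, stride):
    """Approximate spatial size of a feature map at `stride` (ceil division)."""
    h, w = image_hw
    return (-(-h // stride), -(-w // stride))


image_hw = (800, 1200)  # illustrative input size, not taken from the repository

# A stride-2 projection applied to res2 or res3 features.
for level in ("res2", "res3"):
    s = effective_stride(level, proj_stride=2)
    print(level, "with stride-2 projection -> stride", s, "mask", mask_size(image_hw, s))

# Stride 4, as in the original DETR segmentation head.
print("stride 4 mask", mask_size(image_hw, 4))
```

Under these assumptions, a stride-2 projection on res2 gives stride 8 and on res3 gives stride 16, so the finest mask is two to four times coarser per side than a stride 4 mask.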
Thanks in advance