Closed Yuxin-CV closed 4 years ago
@Yuxin-CV Unfortunately, we use design B ...
Thanks for your reply. I think it is better to make this clear in the paper, because as mentioned in the paper there are 3 heads: Classification Head, Controller Head & Center-ness Head, and I really can't figure out how they are organized from the paper...
BTW, will the code of CondInst be released soon? I find it really hard to reproduce the results...
@Yuxin-CV Our paper is in submission, so we won't release the code until the paper gets accepted. You may refer to other implementations of CondInst.
Thanks for your reply.
I am also interested in the loss behavior of the FCOS part in CondInst, e.g., cls_loss & reg_loss. Compared with FCOS, do the losses become higher or lower in CondInst? This information would be quite helpful for debugging. Thanks!
@Yuxin-CV I cannot find the log files now, but I think the detector's losses should be lower because of the improved detection performance.
Thanks for your suggestion~ I will check my code.
Hi~ @tianzhi0549 My implementation of CondInst is built upon https://github.com/Epiphqny/CondInst, and I found a bug in this codebase: the `locations` tensor's N dim and L dim are not transposed, so when batch_size_per_GPU > 1 the implementation of rel. coord. is wrong, and this degrades the results.
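To illustrate the transpose issue, here is a minimal sketch (not code from either repository; the tensor names, shapes and the normalization constant are my assumptions) of how the relative coordinates can be computed so that the instance dim and the location dim broadcast correctly:

```python
import torch

def compute_rel_coords(locations, instance_centers, soi=64.0):
    # locations:        (L, 2) x/y coordinates of the mask-branch feature map.
    # instance_centers: (N, 2) x/y coordinates of the N sampled instances in the batch.
    # Returns:          (N, 2, L) relative-coordinate maps, one per instance.
    #
    # The broadcasting must pair every instance (N dim) with every location (L dim);
    # if the two dims are swapped, instances from different images in the batch get
    # matched with the wrong coordinate grids.
    rel = locations.reshape(1, -1, 2) - instance_centers.reshape(-1, 1, 2)  # (N, L, 2)
    rel = rel.permute(0, 2, 1).float()                                      # (N, 2, L)
    return rel / soi  # normalize the offsets; the constant here is an assumption
```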
I fixed this bug and got improved results compared with my previous results: https://github.com/aim-uofa/AdelaiDet/issues/39#issue-604484494
Resolution of Mask Prediction | Box AP | Mask AP |
---|---|---|
1 / 8 | 39.5 | 33.0 (-1.4) |
1 / 4 | 39.5 | 31.7 (-4.1) |
1 / 2 | 39.5 | 34.5 (-1.2) |
The results at different resolutions of mask prediction show a similar & reasonable Box AP (39.5), but the Mask AP is abnormal, especially for the 1/4 resolution case. So I think there are at least some problems in the alignment of the mask feature during training (I use the `aligned_bilinear` as you @tianzhi0549 mentioned in https://github.com/Epiphqny/CondInst/issues/1) & in the postprocessing. To avoid the feature alignment issue and study the postprocessing part of my code, I focus on the 1/8 resolution case and modify the postprocessing code to
`masks = masks_per_image[0, :, :self.image_sizes[0][0] // 8, :self.image_sizes[0][1] // 8].sigmoid()`
The masks are then rescaled to the original image resolution (using `F.interpolate`). This gives a 0.6 boost in Mask AP (33.6), but there is still a 0.8 AP gap for the 1/8 resolution case.
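For concreteness, here is a hedged sketch of the postprocessing described above (crop the padded border at 1/8 resolution, then rescale with `F.interpolate`); names like `masks_per_image` and the final thresholding step are assumptions from my own code, not any official implementation:

```python
import torch.nn.functional as F

def postprocess_masks(masks_per_image, image_size, mask_stride=8, thresh=0.5):
    # masks_per_image: (1, N, H/8, W/8) raw mask logits for one image.
    # image_size:      (h, w) of the input image before padding.
    h, w = image_size
    # Crop away the padded border at 1/8 resolution before upsampling,
    # otherwise the padding gets stretched into the final masks.
    masks = masks_per_image[0, :, :h // mask_stride, :w // mask_stride].sigmoid()
    # Rescale to the original image resolution and binarize.
    masks = F.interpolate(masks[None], size=(h, w), mode="bilinear", align_corners=False)[0]
    return masks > thresh
```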
Also, there is a 0.2 AP gap for Box AP (https://github.com/aim-uofa/AdelaiDet/issues/20#issuecomment-610162815). This indicates that there must be some problems in the training code, probably in the feature alignment between the mask prediction & the GT (but I already use `aligned_bilinear` during training...) or in the GT preparation process (most probably, I think...).
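For reference, this is my understanding of what an alignment-aware upsampling such as `aligned_bilinear` does (a sketch only; the exact padding scheme in the official code may differ):

```python
import torch.nn.functional as F

def aligned_bilinear(tensor, factor):
    # tensor: (N, C, H, W); factor: integer upsampling factor (e.g. 2 or 4).
    assert tensor.dim() == 4 and factor >= 1 and int(factor) == factor
    if factor == 1:
        return tensor
    h, w = tensor.size()[2:]
    # Replicate-pad one extra row/column so that align_corners=True keeps the
    # upsampled feature map aligned with the pixel grid.
    tensor = F.pad(tensor, pad=(0, 1, 0, 1), mode="replicate")
    oh, ow = factor * h + 1, factor * w + 1
    tensor = F.interpolate(tensor, size=(oh, ow), mode="bilinear", align_corners=True)
    # Shift by half a stride and crop back to (factor*H, factor*W).
    tensor = F.pad(tensor, pad=(factor // 2, 0, factor // 2, 0), mode="replicate")
    return tensor[:, :, :oh - 1, :ow - 1]
```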
So I wonder whether you @tianzhi0549 could please help me with the above problems. Could you provide some detailed information or some code snippets for the feature alignment, the postprocessing & the GT preparation process? I really need your help...
@Yuxin-CV We have released the code of BlendMask. CondInst is implemented with the same codebase. I think it should be helpful to you. Also, a hint is if the performance degradation is due to misalignment, you should see much more performance degradation on small objects than on large objects.
Thanks for your suggestions. @tianzhi0549 I modified my code and got improved results.
Resolution of Mask Prediction | Box AP | Mask AP | APs | APm | APl |
---|---|---|---|---|---|
1 / 8 | 39.5 | 33.6 (-0.8) | 14.7 (-0.4) | 37.9 (0.5) | 49.0 (-1.8) |
1 / 4 | 39.5 | 34.7 (-1.0) | 16.2 (-0.8) | 38.3 (-1.0) | 49.5 (-1.6) |
1 / 2 | 39.5 (-0.2) | 35.0 (-0.7) | 17.0 (-0.1) | 38.6 (-0.5) | 49.3 (-0.9) |
It seems that the bottleneck for the 1/2 case is no longer APs (misalignment). For now,
- The performance gap in APl & APm is relatively large (0.9 & 0.5).
- There is still a 0.2 gap in Box AP.

Could you please give me some suggestions?
BTW, I want to make sure that:
1) The mask branch is
Design A: P3 feature -> Conv(256, 128) -> 4 x Conv(128, 128) -> Conv(128, 8)
or
Design B: P3 feature -> Conv(256, 128) -> 3 x Conv(128, 128) -> Conv(128, 8)
2) ... the `mask FCN head` ... before computing the Dice Loss, is it right?
Thanks! @tianzhi0549
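For reference, a generic Dice-loss sketch for mask supervision; the squared-denominator form and the smoothing constant are assumptions on my side, not necessarily CondInst's exact variant:

```python
def dice_loss(pred, target, eps=1e-5):
    # pred:   (N, H*W) mask probabilities in [0, 1] (after sigmoid).
    # target: (N, H*W) binary GT masks at the same resolution.
    inter = (pred * target).sum(dim=1)
    union = (pred * pred).sum(dim=1) + (target * target).sum(dim=1)
    loss = 1.0 - (2.0 * inter + eps) / (union + eps)
    return loss.mean()
```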
@Yuxin-CV 1) The mask branch should be similar to the basis module in BlendMask. But we do not upsample the feature maps from 8x to 4x here. I don't think these design choices of the mask branch are critical. 2) Yes.
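A minimal sketch of a mask branch along these lines, using the Design A channel configuration from the question above (the 3x3 kernels, plain ReLU activations and the absence of normalization are assumptions, not the official architecture):

```python
import torch.nn as nn

class MaskBranch(nn.Module):
    # P3 feature -> Conv(256, 128) -> 4 x Conv(128, 128) -> Conv(128, 8),
    # kept at 1/8 resolution (no 8x -> 4x upsampling, per the reply above).
    def __init__(self, in_channels=256, mid_channels=128, out_channels=8, num_convs=4):
        super().__init__()
        layers = [nn.Conv2d(in_channels, mid_channels, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(num_convs):
            layers += [nn.Conv2d(mid_channels, mid_channels, 3, padding=1), nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(mid_channels, out_channels, 3, padding=1))
        self.tower = nn.Sequential(*layers)

    def forward(self, p3):
        return self.tower(p3)
```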
Thanks for your reply! Could you give me some suggestions for the issue mentioned in https://github.com/aim-uofa/AdelaiDet/issues/39#issuecomment-619559827?
Hi~ @tianzhi0549, thanks for your reply. I wonder what kind of activation function, normalization layer & initialization method you use in the CondInst mask branch. These are not mentioned in the paper.
I get low mask APs. Can you share the experiment log?
[05/04 09:11:08 d2.evaluation.testing]: copypaste: 39.4628,58.8626,42.7325,23.9317,42.9279,50.3428
[05/04 09:11:08 d2.evaluation.testing]: copypaste: Task: segm
[05/04 09:11:08 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[05/04 09:11:08 d2.evaluation.testing]: copypaste: 32.8535,55.2280,33.6173,13.3537,36.2689,49.4594
Hi~ @tianzhi0549 I want to make sure about the shared head architecture of CondInst. Design A
Design B
Which one is right? I found that Design B degrades the Box AP, and the mask AP is also very low. Here are my results for MS-R-50_1x.
The Box AP should be higher than 39.5 for MS training (~39.5) & multi-task training (+~1.0). So I think Design B is wrong: it is hard for one branch to handle 3 predictions, and the gradient from controller_pred degenerates the reg_pred.
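For illustration only, here is a sketch of what "one branch handling 3 predictions" means here (my reading of Design B based on the description above; the tower depth, normalization and exact prediction layers are assumptions rather than the official CondInst head):

```python
import torch.nn as nn

class SharedBoxTowerHead(nn.Module):
    # One shared tower whose output feeds the reg, center-ness and controller
    # predictions; the concern above is that controller gradients flowing
    # through this shared tower may degenerate the box regression.
    def __init__(self, in_channels=256, num_convs=4, num_gen_params=169):
        super().__init__()
        tower = []
        for _ in range(num_convs):
            tower += [nn.Conv2d(in_channels, in_channels, 3, padding=1),
                      nn.GroupNorm(32, in_channels),
                      nn.ReLU(inplace=True)]
        self.tower = nn.Sequential(*tower)
        self.reg_pred = nn.Conv2d(in_channels, 4, 3, padding=1)       # l, t, r, b distances
        self.ctrness_pred = nn.Conv2d(in_channels, 1, 3, padding=1)   # center-ness
        # Dynamic mask-head parameters (169 in the paper's default setting).
        self.controller_pred = nn.Conv2d(in_channels, num_gen_params, 3, padding=1)

    def forward(self, feature):
        t = self.tower(feature)
        return self.reg_pred(t), self.ctrness_pred(t), self.controller_pred(t)
```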