@tianzhi0549 Thanks for pointing that out, I will try it and update with the new result later.
@Epiphqny I also note that it seems you are using absolute coordinates as the input to the mask heads, which is not correct. It is important to use relative coordinates here because we hope the generated filters are position-independent.
@tianzhi0549 The coordinates in this implementation range from -1 to 1. What do you mean by relative coordinates, should they be in 0-1 instead?
@Epiphqny https://github.com/aim-uofa/AdelaiDet/issues/10. You can refer to the explanation here.
@tianzhi0549 OK, I will try that.
@tianzhi0549 It sounds like the relative coordinates are in some way like the center-ness... but implemented in a different way, just my opinion.
@tianzhi0549 They may be similar in some aspects, but they are designed for totally different purposes ...
@tianzhi0549 Yes, both are interesting ideas!
@tianzhi0549 Hi, I replaced the original upsampling with the aligned version and used the upsampled mask to compute the loss; the AP is now 37.1. But this is still the absolute-coordinate version, I will update with new results after the training of the relative-coordinate version finishes.
@Epiphqny Great! For the memory usage issue, you could limit the maximum number of samples used to compute masks during training, as in the sketch below. Using relative coordinates might also boost the performance considerably.
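A minimal sketch of such a cap; `max_proposals` and the function name are illustrative, not this repo's actual API (the official AdelaiDet repo exposes a similar option, if I recall correctly):

```python
import torch

# Randomly keep at most `max_proposals` positive samples for the mask loss
# each iteration, to bound peak memory during training.
def cap_mask_samples(pos_inds: torch.Tensor, max_proposals: int) -> torch.Tensor:
    if 0 < max_proposals < pos_inds.numel():
        keep = torch.randperm(pos_inds.numel(), device=pos_inds.device)[:max_proposals]
        pos_inds = pos_inds[keep]
    return pos_inds
```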
@tianzhi0549 Perhaps there is some problem in my implementation of relative coordinates; it only achieves 36.9 mAP, which is worse than the absolute-coordinate version.
@Epiphqny if possible, you can push your code to a new branch of the repo. I can help check it.
Hi @tianzhi0549, I have added the code in the relative_coordinate branch, thank you very much for the help!
@Epiphqny Are you sure this line is correct? https://github.com/Epiphqny/CondInst/blob/1b03b70ea6c71f0e951ed2771ad16a24515d4c3c/fcos/modeling/fcos/fcos_outputs.py#L591
@Epiphqny Hi~ Thanks for sharing your code!
It seems that the setting of IMS_PER_BATCH and BASE_LR in your config is incorrect.
https://github.com/Epiphqny/CondInst/blob/ea3f717fce73a8e4c273f1379c9d9c3550387e1b/configs/CondInst/Base-FCOS.yaml#L17-L18
IMS_PER_BATCH and BASE_LR should be changed together according to the Linear Scaling Rule: you need to set the learning rate proportional to the batch size if you use a different number of GPUs or images per GPU, e.g., IMS_PER_BATCH = 4 & BASE_LR = 0.0025.
I also found a similar problem in your Yolact_fcos repo: https://github.com/Epiphqny/Yolact_fcos/blob/b131542a930499523343d3fd660088e7e372c317/configs/Yolact/Base-FCOS.yaml#L16-L18
Though changing IMS_PER_BATCH and BASE_LR according to the Linear Scaling Rule cannot guarantee reproducing the results in the paper, I think it can help you obtain a very close result. @tianzhi0549 @Epiphqny
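For concreteness, a minimal sketch of the rule, assuming the 16-image / 0.01-LR reference setting implied by the example above:

```python
# Linear Scaling Rule: set the learning rate proportional to the total batch
# size. The reference pair (16 images, LR 0.01) is assumed from the example
# IMS_PER_BATCH = 4 & BASE_LR = 0.0025 given above.
REFERENCE_BATCH = 16
REFERENCE_LR = 0.01

def scaled_lr(ims_per_batch: int) -> float:
    return REFERENCE_LR * ims_per_batch / REFERENCE_BATCH

assert scaled_lr(4) == 0.0025  # matches IMS_PER_BATCH = 4 & BASE_LR = 0.0025
```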
@Yuxin-CV Thank you very much for pointing that out, I will try the Linear Scaling Rule later.
@tianzhi0549 Sorry, I cannot find the problem in this line. Could you point it out directly?
@Epiphqny I would suggest that you compute all the coordinate transformations on the scale of the input image. After you get the final relative coordinates, you can normalize them by a constant scale. Please make sure that even after normalization, the locations generating the filters are always at (0, 0).
@tianzhi0549 I have subtracted the center coordinate in https://github.com/Epiphqny/CondInst/blob/1b03b70ea6c71f0e951ed2771ad16a24515d4c3c/fcos/modeling/fcos/fcos_outputs.py#L600, and the values at the center locations are zero.
Personally, I think you can try the R-50 1x lr_schedule with input_size = 800 and batch_size = 16 first, before using a stronger backbone and a longer lr_schedule. You can get the results in less than a day if you have access to 4 or 8 GPUs. Looking forward to your result! @Epiphqny
BTW, I wonder how you @tianzhi0549 implement the forward_mask() part in the official code. Do you simply use a for loop just like @Epiphqny's implementation:
https://github.com/Epiphqny/CondInst/blob/ea3f717fce73a8e4c273f1379c9d9c3550387e1b/fcos/modeling/fcos/fcos_outputs.py#L585-L607
or some other highly optimized implementation, e.g., a CUDA kernel?
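One loop-free possibility (a hedged sketch, not necessarily the official code): run all instances' dynamic mask heads as a single grouped convolution, looping only over layers. The function name and tensor shapes below are assumptions for illustration:

```python
import torch.nn.functional as F

def dynamic_mask_heads(features, weights, biases, num_insts):
    # features: (1, num_insts * C_in, H, W) -- shared mask-branch features
    #           tiled once per instance (with rel. coords. concatenated).
    # weights[i]: (num_insts * C_out_i, C_in_i, 1, 1) generated filters.
    # biases[i]:  (num_insts * C_out_i,) generated biases.
    x = features
    for i, (w, b) in enumerate(zip(weights, biases)):
        # groups=num_insts applies each instance's own filters to its own
        # channel slice, so no Python loop over instances is needed.
        x = F.conv2d(x, w, bias=b, stride=1, padding=0, groups=num_insts)
        if i < len(weights) - 1:
            x = F.relu(x)
    return x  # (1, num_insts, H, W) mask logits if the last layer has C_out = 1
```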
Hi~ @Epiphqny
I also found that the mask_loss's normalization factor N_pos in your code is not reduced across GPUs.
https://github.com/Epiphqny/CondInst/blob/4a519c12b7be83f86b3d75c62cf3a87a9dec31a7/fcos/modeling/fcos/fcos_outputs.py#L581-L582
I think it is better to use num_pos_avg as the normalization factor, which is the average number of positive samples across the different GPUs:
https://github.com/Epiphqny/CondInst/blob/4a519c12b7be83f86b3d75c62cf3a87a9dec31a7/fcos/modeling/fcos/fcos_outputs.py#L504-L508
mask_loss = mask_loss / num_pos_avg
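A minimal sketch of such a cross-GPU average, assuming torch.distributed is used for multi-GPU training; the helper names are illustrative, not the repo's actual utilities:

```python
import torch
import torch.distributed as dist

def average_across_gpus(value: torch.Tensor) -> torch.Tensor:
    # Sum the value over all processes, then divide by the world size.
    if dist.is_available() and dist.is_initialized():
        value = value.clone()
        dist.all_reduce(value, op=dist.ReduceOp.SUM)
        value = value / dist.get_world_size()
    return value

def normalize_mask_loss(mask_loss: torch.Tensor, num_pos_local: int) -> torch.Tensor:
    num_pos = torch.tensor(float(num_pos_local), device=mask_loss.device)
    # Clamp to 1.0 so the loss stays well-defined when a GPU sees no positives.
    num_pos_avg = max(average_across_gpus(num_pos).item(), 1.0)
    return mask_loss / num_pos_avg
```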
@Epiphqny I think the rel. coord. should be location-specific, just like this (a runnable version of my pseudocode; norm_scale is a placeholder for the constant normalization scale):

```python
import torch

# For a location (x, y) on the input image:
x_range = torch.arange(W_mask, dtype=torch.float32)
y_range = torch.arange(H_mask, dtype=torch.float32)
y_grid, x_grid = torch.meshgrid(y_range, x_range)
# Center the grid on the location so it sits at (0, 0), then divide by a
# constant scale so the values fall roughly in [-1, 1].
y_rel_coord = (y_grid - y / mask_stride) / norm_scale
x_rel_coord = (x_grid - x / mask_stride) / norm_scale
rel_coord = torch.stack([x_rel_coord, y_rel_coord], dim=0)
```
@tianzhi0549 Am I right? Could you provide the official code snippet of rel. coord.? Thanks!
@Yuxin-CV Please modify the code and train the model, then report the result here. I will update the repo if there is an improvement. I don't have an extra GPU to train the model now.
OK
@Epiphqny For your information. https://github.com/aim-uofa/AdelaiDet/issues/23#issuecomment-611870073. Thank you:-).
@tianzhi0549 OK, thanks for providing the code.
@tianzhi0549 I got the same result in your docker using "aligned_bilinear" and "F.interpolate"!
@tianzhi0549 One question about aligned_bilinear: I noticed that other interpolation operations in detectron2 and adet require align_corners=False (e.g., image and mask resizing). Should we change the other align_corners settings to True when using CondInst? Thanks.
https://github.com/Epiphqny/CondInst/blob/4a519c12b7be83f86b3d75c62cf3a87a9dec31a7/fcos/modeling/fcos/fcos_outputs.py#L366 The default bilinear interpolation in PyTorch is not aligned, which might significantly degrade the performance, in particular for small objects.
Please try the aligned bilinear.
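For reference, a sketch of aligned bilinear upsampling in the spirit of AdelaiDet's aligned_bilinear, reconstructed from memory, so treat the details as assumptions rather than the official code:

```python
import torch.nn.functional as F

def aligned_bilinear(tensor, factor: int):
    # Upsample an (N, C, H, W) tensor by an integer factor with aligned corners.
    assert tensor.dim() == 4 and factor >= 1
    if factor == 1:
        return tensor
    h, w = tensor.size()[2:]
    tensor = F.pad(tensor, pad=(0, 1, 0, 1), mode="replicate")
    oh, ow = factor * h + 1, factor * w + 1
    tensor = F.interpolate(tensor, size=(oh, ow), mode="bilinear", align_corners=True)
    # Shift by half a stride so output pixel centers line up with the input's.
    tensor = F.pad(tensor, pad=(factor // 2, 0, factor // 2, 0), mode="replicate")
    return tensor[:, :, : oh - 1, : ow - 1]
```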