Open KimSoybean opened 6 years ago
Hi @bowenc0221 ,
Can you please give me some ideas about how to process ROIs of different sizes? Thanks.
Thank you! I want to apply DCR-v1 to a one-stage detector. Do you have any ideas about that?
Hi @hongdayu ,
ROIs are first cropped on the original image and resized to 224x224. You may find this code helpful.
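The code linked in the reply is not reproduced in this thread. As a rough illustration of the crop-then-resize step only, here is a minimal pure-Python sketch. The function name `crop_and_resize`, the `(x1, y1, x2, y2)` box layout, and the nearest-neighbour sampling are assumptions for illustration, not the repository's implementation; a real pipeline would use an image library and would likely interpolate differently.

```python
def crop_and_resize(image, box, out_size=224):
    """Crop box=(x1, y1, x2, y2) from image (a list of pixel rows) and
    resize the crop to out_size x out_size with nearest-neighbour sampling.

    Hypothetical helper: the box layout and sampling rule are assumptions.
    """
    height, width = len(image), len(image[0])
    x1, y1, x2, y2 = box
    # Clip the box to the image and make sure it covers at least one pixel.
    x1 = min(max(int(x1), 0), width - 1)
    y1 = min(max(int(y1), 0), height - 1)
    x2 = min(max(int(x2), x1 + 1), width)
    y2 = min(max(int(y2), y1 + 1), height)
    crop_w, crop_h = x2 - x1, y2 - y1

    resized = []
    for row in range(out_size):
        # Map each output row/column back to a source pixel inside the crop.
        src_y = y1 + row * crop_h // out_size
        resized.append([
            image[src_y][x1 + col * crop_w // out_size]
            for col in range(out_size)
        ])
    return resized
```

Because every ROI is mapped onto the same `out_size x out_size` grid, ROIs of different sizes all end up as inputs of one fixed shape. Note that this sketch stretches the crop and does not preserve its aspect ratio.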
Hi @KimSoybean ,
Since one-stage detectors produce many more boxes than two-stage detectors (typically 300 for Faster RCNN), you may need to decide on a trade-off: how many boxes you want to process and how to select them.
Thank you! @bowenc0221
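One possible selection rule, sketched under assumptions: drop very low-scoring boxes, then keep the highest-scoring ones up to a fixed budget. The function name `select_boxes` and the `min_score` value are hypothetical; the default budget of 300 simply mirrors the Faster RCNN figure mentioned above and is not a recommendation from the thread.

```python
def select_boxes(boxes, scores, max_boxes=300, min_score=0.05):
    """Keep at most max_boxes boxes, highest score first.

    Hypothetical helper: both thresholds are illustrative assumptions.
    """
    # Indices of boxes that pass the score floor.
    keep = [i for i, score in enumerate(scores) if score >= min_score]
    # Highest-scoring boxes first, then cut to the budget.
    keep.sort(key=lambda i: scores[i], reverse=True)
    keep = keep[:max_boxes]
    return [boxes[i] for i in keep], [scores[i] for i in keep]
```

A larger `max_boxes` gives the second-stage classifier more candidates to rescore at the cost of more crops to process per image.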
Thank you !!!
In the sampling strategies, some pad_indexes are randomly sampled from all boxes, or are empty ( pad_indexes=[] ). This means the same samples can appear among both the positive and the negative samples. Am I misunderstanding something?
@KimSoybean The padding is needed to form a fixed-size batch. For example, if batchsize=32 but we only sampled 30 boxes, then we just pad it to 32 boxes and assign label -1 (ignored during training) for the padded boxes.
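A minimal sketch of that padding step, assuming the padded entries are drawn at random from all boxes as described in the question. The names `pad_batch`, `num_boxes`, and `ignore_label` are hypothetical; only `pad_indexes`, the batch size of 32, and the label -1 come from the thread.

```python
import random

def pad_batch(indexes, labels, num_boxes, batch_size=32, ignore_label=-1):
    """Pad sampled box indexes and labels up to a fixed batch size.

    Hypothetical helper illustrating the idea, not the repository code.
    """
    missing = batch_size - len(indexes)
    if missing <= 0:
        # Enough boxes were sampled, so pad_indexes would be empty.
        return list(indexes[:batch_size]), list(labels[:batch_size])
    # Padded entries may repeat boxes that were already sampled ...
    pad_indexes = [random.randrange(num_boxes) for _ in range(missing)]
    # ... but they carry the ignore label, so they add nothing to the loss.
    return list(indexes) + pad_indexes, list(labels) + [ignore_label] * missing
```

With 30 sampled boxes and `batch_size=32`, the result holds 32 indexes and the last two labels are -1. A padded box can duplicate a positive or negative sample, but since its label is ignored during training it does not act as a conflicting example.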
I see. Thank you. @bowenc0221
Why do you multiply the two scores? That makes the original score lower, or am I misunderstanding something?
@KimSoybean
The reason is that the new classifier (DCR) is trained without location information. If we use only the scores from DCR, we observe very poor performance (in this case DCR effectively becomes RCNN and, because of its different sampling strategy, it performs worse than RCNN when used alone).
Yes, multiplying the two scores makes the final score lower. However, it is a reranking process: DCR decreases the scores of False Positives (FPs) by a larger amount than those of True Positives (TPs). This changes the relative ranking, so more FPs are suppressed by a predefined threshold. In the final evaluation, only the relative ranking matters.
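A toy sketch of the reranking effect, with made-up numbers chosen only for illustration. The helper names `rescore` and `keep_above` and the 0.5 threshold are hypothetical and are not taken from the repository.

```python
def rescore(det_scores, dcr_scores):
    """Combine base-detector and DCR scores by element-wise multiplication."""
    return [det * dcr for det, dcr in zip(det_scores, dcr_scores)]

def keep_above(scores, threshold):
    """Return the indexes of detections whose score reaches the threshold."""
    return [i for i, score in enumerate(scores) if score >= threshold]

# Made-up example: detection 0 is a true positive, detection 1 a false positive.
det_scores = [0.9, 0.8]   # the base detector is confident about both
dcr_scores = [0.9, 0.2]   # DCR is confident only about the true positive
final_scores = rescore(det_scores, dcr_scores)

threshold = 0.5           # illustrative threshold, an assumption
kept_before = keep_above(det_scores, threshold)
kept_after = keep_above(final_scores, threshold)
```

Both final scores are lower than the detector's originals (about 0.81 and 0.16), but the false positive drops far more. In this example `kept_before` contains both detections while `kept_after` contains only the true positive.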
Thanks! @bowenc0221. Have you trained DCR for more than 9 epochs? Will overfitting happen? I want to know more details because I'm using DCR for one-stage face detection.
@KimSoybean I haven't trained it for more epochs. I simply trained for the same number of epochs as used for training Faster RCNN, and that might not be the optimal setting. However, I think overfitting might happen if you do not use any data augmentation. (One-stage detectors are trained for more epochs because they use strong data augmentation.)
Hi, I have another question: how do you assign labels to the images handed to the stage-2 DCR model (v1)? Is it exactly the same standard as in stage 1?
DCR-v1 is a stand-alone classification network aiming to suppress (hard) false positives in object detection. You can think of DCR-v1 as a classifier; we use ResNet-152 in our paper. The input to DCR-v1 is a batch of images of size 3x224x224. Each image is a cropped proposal from the base detector's output.
./dcr_v1/train_rcnn.py is used to train DCR-v1.
./dcr_v1/rcnn_rescore_combined_fast.py is used to combine DCR-v1 results with base detector's classification results (by simply multiplying two scores).