GHPAUN opened this issue 4 years ago
Hello sir, I'm reproducing the DetectoRS HTC X101 performance on COCO. With 1 img per GPU on 16 GPUs, which is the recommended setting for HTC X101, I'm getting 51.4 mAP; with 1 img per GPU on 32 GPUs, I'm getting 52.4 mAP, which is pretty close to the off-the-shelf model, but there is still some gap. There might be some details I didn't notice. Also, the training is very time- and GPU-memory-consuming. Have you tried mmdet FP16 training, and how is the performance? Thanks a lot.
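(For reference on the FP16 question: in mmdetection, mixed-precision training is usually switched on by adding an fp16 dict to the config. A minimal sketch, assuming the standard mmdet FP16 hook; whether it is stable with DetectoRS is exactly what is being asked here.)

fp16 = dict(loss_scale=512.)  # standard mmdet mixed-precision setting; not verified with DetectoRS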
Hi, I want to take this opportunity to ask a question: which GPU did you use for training? Even R50 with image scale (200, 200) causes OOM on a 1080Ti. Any suggestions?
@GHPAUN Please first try reproducing the smaller models as in the mmdetv2 branch. I haven't used FP16 and am not sure whether DetectoRS can be trained with it or not.
@we1pingyu We used TITAN RTX to train all the models. If you run into memory issues, please try the cascade-based DetectoRS, which requires less memory, and set with_cp to True for all the backbones.
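For reference, a minimal sketch of what that change might look like in an mmdet-style config (the exact backbone type and field names are assumptions; adapt them to the config you are actually using). with_cp enables gradient checkpointing (torch.utils.checkpoint) inside the residual blocks, trading extra forward computation for a much smaller activation memory footprint:

model = dict(
    backbone=dict(
        # ... existing backbone settings ...
        with_cp=True),       # checkpointing in the main backbone
    neck=dict(
        type='RFP',
        rfp_backbone=dict(
            # ... existing RFP backbone settings ...
            with_cp=True)))  # and in the recursive (RFP) backbone copy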
@joe-siyuan-qiao Thank you for your reply. I have already reproduced the smaller models, i.e., DetectoRS HTC R50: 49.1 mAP at 12 epochs and 51.4 at 40 epochs. I just got 52.9 mAP using 64 GPUs and lr 0.04, and with TTA I got 54.1 mAP on the val set. I'll try the test set later. I guess it's quicker and more stable to use a larger batch size with a larger lr. @we1pingyu I used V100s.
@GHPAUN 52.9 mAP on val2017 matches the performance of the provided pre-trained model. mAPs on test-dev are usually better than those on val2017. Just out of curiosity, why did you use lr 0.04 when the batch size is 64? I thought it would be 0.08 to match the linear scaling rule.
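For reference, the linear scaling rule being discussed, as a small sketch (the base setting of 16 images per batch with lr 0.02 is the common mmdet default and is assumed here):

def scaled_lr(batch_size, base_lr=0.02, base_batch=16):
    # Linear scaling rule: lr grows in proportion to the total batch size
    return base_lr * batch_size / base_batch

print(scaled_lr(64))  # 0.08, i.e. double the 0.04 actually used above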
I'm trying to reproduce the 53.3 of the pre-trained model from the repo on the COCO dataset, but my result on val2017 is only 51.5. Am I missing something?
Here is my code:
from mmdet.apis import init_detector, inference_detector, show_result_pyplot
import mmcv
import glob
import os

if __name__ == '__main__':
    config_file = 'DetectoRS_mstrain_400_1200_x101_32x4d_40e.py'
    checkpoint_file = 'DetectoRS_X101-ed983634.pth'
    # Build the model from the config and load the pre-trained weights
    model = init_detector(config_file, checkpoint_file, device='cuda:0')

    out = open('coco_DetectoRS_val2017_preds.csv', 'w')
    out.write('img_id,label,score,x1,x2,y1,y2\n')

    images = glob.glob('val2017/*.jpg')
    for img in images:
        img_id = os.path.basename(img)
        # For HTC-based models the result is (bbox_results, segm_results);
        # res[0][i] holds the boxes for class i as [x1, y1, x2, y2, score]
        res = inference_detector(model, img)
        for i in range(80):
            for j in range(len(res[0][i])):
                x1, y1, x2, y2, prob = res[0][i][j]
                out.write('{},{},{},{:.2f},{:.2f},{:.2f},{:.2f}\n'.format(
                    img_id, i, prob, x1, x2, y1, y2))
    out.close()
Then I run pycocotools, and here are the results:
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.515
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.710
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.564
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.318
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.565
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.676
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.384
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.628
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.671
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.479
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.723
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.828
@ZFTurbo We used tools/dist_test.py to evaluate the pre-trained model. You can take a look at it to find out what the differences are. Thanks.
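For comparison, the standard evaluation would look roughly like this (a sketch, assuming the usual mmdetection dist_test script and that the config lives under configs/DetectoRS/; adjust the paths and GPU count to your setup):

./tools/dist_test.sh configs/DetectoRS/DetectoRS_mstrain_400_1200_x101_32x4d_40e.py DetectoRS_X101-ed983634.pth 8 --eval bbox segm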
@joe-siyuan-qiao Yeah, I thought that too, but in my experiments lr = 0.00125 × batch_size has always been worse than lr = 0.000625 × batch_size; maybe I missed something there.