For training on the train set and evaluation on the val set, my results reach mAP 73.37, with class APs: [89.90, 75.09, 51.92, 69.30, 75.61, 82.47, 88.03, 90.72, 66.22, 87.11, 69.58, 68.80, 72.46, 61.51, 51.89]. My trained model is here (password: aabb). You can try it.
I guess your results are affected by the following aspects:
The learning rate is a sensitive factor for model training. My device environment is as follows: 8× RTX 2080 Ti, 2 images per GPU.
You can try a learning rate of 0.006 or 0.008.
You can also add "RandomRotate" to the config to get a better mAP, as follows: `dict(type='RandomRotate', rate=0.5, angles=[30, 60, 90, 120, 150], auto_bound=False)`
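For orientation, here is a rough sketch of where such a rotation transform usually sits in an mmdetection-style `train_pipeline`; the neighbouring transforms are generic placeholders and may not match the exact entries in orientedreppoints_r50_demo.py:

```python
# Sketch of an mmdetection-style train_pipeline with the suggested rotation
# augmentation inserted; the other entries are generic placeholders.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    # rotation augmentation suggested above
    dict(type='RandomRotate', rate=0.5, angles=[30, 60, 90, 120, 150], auto_bound=False),
    dict(type='Normalize',
         mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
```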
If you have any further questions about this problem, please let me know. I'll try to help you get the expected results.
@hukaixuan19970627
Yeah, the learning rate does have a significant impact on the results. I got mAP 65 with 2 Tesla P40 GPUs, 4 images per GPU, and lr=0.01 (trained on the DOTA train set, tested on the DOTA val set). My DOTA train split contains 14384 files (subsize=1024×1024, gap=100); maybe that's what makes the difference in the results.
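For what it's worth, the usual way to reconcile different GPU counts and per-GPU batch sizes is the linear scaling rule that mmdetection-style configs tend to follow. This is a general heuristic (an assumption on my part, not something documented for this repo), but it explains why a smaller total batch usually pairs with a smaller learning rate:

```python
# Rule of thumb (assumption): scale the learning rate linearly with the total batch size.
# Reference point from this thread: 8 GPUs x 2 imgs/GPU = 16 images at lr = 0.01.
base_lr = 0.01
base_batch = 8 * 2                       # 8x RTX 2080 Ti, 2 images per GPU

gpus, imgs_per_gpu = 2, 4                # the 2x Tesla P40 setup above
lr = base_lr * (gpus * imgs_per_gpu) / base_batch   # -> 0.005

# plugged into an mmdetection-style optimizer setting
optimizer = dict(type='SGD', lr=lr, momentum=0.9, weight_decay=0.0001)
```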
Have you tried mixed-precision training? I added `fp16 = dict(loss_scale=512.)` to the config file, but the mAP is just 4.78. BTW, the mAP is 74.98 with the same config file under FP32 training.
I haven't tried mixed-precision training with this model. As far as I know, the Tesla P40 may not support FP16. Besides, on a GPU that does support it, loss_scale=512 adjusts the magnification of the loss and gradients during training; the appropriate range is 0-1000. I guess the model parameters were not updated because the gradients are too small under FP16. Maybe a larger value will give a better result.
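To make that suggestion concrete, here is a minimal sketch of the fp16 setting in an mmdetection-style config; the larger static scale and the dynamic option are things to try, assuming the installed mmcv/mmdet version supports them, not values verified on this codebase:

```python
# Mixed-precision setting in the config file (mmdetection style).
# loss_scale multiplies the loss before backward() so small FP16 gradients do not
# underflow; the gradients are divided by the same factor before the optimizer step.

fp16 = dict(loss_scale=512.)         # setting reported above, gave mAP 4.78

# alternatives to try (assumptions, not verified on this repo):
# fp16 = dict(loss_scale=1024.)      # larger static scale
# fp16 = dict(loss_scale='dynamic')  # dynamic loss scaling, if the mmcv version supports it
```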
Thank you for your code. I'm learning how to use it, but I've run into some problems and hope to get your help.
config: orientedreppoints_r50_demo.py
changes: img_per_gpu=2 -> img_per_gpu=4, workers_per_gpu=2 -> workers_per_gpu=4, lr=0.01 -> lr=0.005
environment: 2 GPUs (Tesla P40)
mAP on val: 70.84
class APs: [89.43 73.79 40.19 66.33 73.53 82.06 88.16 90.86 60.59 86.46 65.51 64.86 71.29 57.60 51.94]
My question: I used your checkpoint (trained on trainval) to detect on the DOTA val set and got an mAP of about 82, but the mAP of 70.84 (checkpoint trained on the train set, tested on val) feels lower than I expected (73 ~ 75). Is this normal?