yuhongtian17 / Spatial-Transform-Decoupling


Reproducing error from the paper and config file #1

Closed Diagreen closed 9 months ago

Diagreen commented 10 months ago

Thank you for sharing your impressive work on object detection!

I wanted to know the score I would get with your work in my environment, so I did the following:

Model: STD with Oriented RCNN and HiViT-B
Dataset: DOTA-v1.0
Training option: as per the config file "rotated_imted_hb1m_oriented_rcnn_hivitdet_base_1x_dota_ms_rr_le90_stdc_xyawh321v.py"
For image splitting:
  Train set / Test set: multi-scale / gap 500 / scales 0.5, 1.0, 1.5
  Val set: single-scale / gap 200

However, in the config file the training data consists of 'trainval' (referred to as 'data_root_ms + trainval~~'), so I created the trainval set with the multi-scale option.

After 12 epochs, I ran a test on the test set using the dist_test.sh file, but I couldn't achieve results similar to those in the paper; I got an mAP of only about 28.7. Could there be a mistake in my training or testing approach?

Could you please tell me if there is a problem with my experiment?

yuhongtian17 commented 10 months ago

Thank you for your feedback. Roughly speaking, the training process itself seems fine, since you still reach 28.7% mAP rather than 0. However, more training details need to be verified. You can provide us with the training log file and the feedback email from the DOTA official website after submitting your test-set results, and we will troubleshoot the issue further for you.

  1. Image split of the DOTA dataset. When running the split plan provided by mmrotate-0.3.3/0.3.4, you will receive a message reporting the "total images number" (please verify that your datasets are consistent with ours):
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_train.json
# Total images number: 15749
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ss_val.json
# Total images number: 5297
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ms_trainval.json
# Total images number: 138883
python tools/data/dota/split/img_split.py --base-json tools/data/dota/split/split_configs/ms_test.json
# Total images number: 71888
  2. Loading of our pre-trained model. According to the config file, the pre-trained model should be "mae_hivit_base_dec512d8b_hifeat_p1600lr10.pth", and all Transformer modules should be loaded correctly. You can calculate the MD5 checksum of the downloaded pth file to verify its integrity; it should be "8b7eb23d755b7e18a123bda282d74e32" (see the sketch after this list). You can also provide us with the "unexpected key(s)" and "missing key(s)" hints printed when loading the pre-trained model.

  3. In addition, the training log file and the feedback from the DOTA official website after submission can be sent to my email address {yuhongtian17@mails.ucas.ac.cn} for further analysis. If the final mAP is only 28.7%, we believe some abnormality should already be visible in earlier epochs (for example, the interval mAP of the first epoch being far below 70%).
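
For point 2, a minimal sketch of the checksum verification in Python, assuming the checkpoint sits in the current working directory (the path is illustrative; the md5sum command line tool gives the same result):

import hashlib

def md5sum(path, chunk_size=1 << 20):
    # Compute the MD5 checksum of a file in streaming chunks,
    # so a large .pth file does not need to fit in memory at once.
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

# Path is an assumption; point it at your downloaded checkpoint.
ckpt = 'mae_hivit_base_dec512d8b_hifeat_p1600lr10.pth'
print(md5sum(ckpt))
# Expected (per point 2 above): 8b7eb23d755b7e18a123bda282d74e32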

Diagreen commented 10 months ago

Thank you for your response. I got a similar mAP (28.7) when I used the official checkpoint and the multi-scale test set with the dist_test.sh file. So I tried a single-scale test and got the score below.

This is your evaluation result for task 1 (VOC metrics):

mAP: 0.8103967529522446

ap of each class:
plane: 0.8933291062571685, baseball-diamond: 0.854736730497839, bridge: 0.6086980723791174, ground-track-field: 0.7715486101484151, small-vehicle: 0.8194151305533786, large-vehicle: 0.8621607229128045, ship: 0.8852212356056705, tennis-court: 0.9083021594186221, basketball-court: 0.8694875621953049, storage-tank: 0.8752260414632869, soccer-ball-field: 0.6755570366758901, roundabout: 0.7022241944847207, harbor: 0.7906822317608828, swimming-pool: 0.8431659911101292, helicopter: 0.7961964688204357

COCO style result:

AP50: 0.8103967529522446 AP75: 0.5922508026699274 mAP: 0.5327391650331366

The training log is attached below. When I evaluate with this weight file on the single-scale test set, I get an mAP of 0.7812.

Maybe the gap between the single-scale and multi-scale test sets explains part of the mAP difference, but I get the bad score when I use the multi-scale one. I followed the installation procedure suggested on this Git page. Is there any other way to apply the multi-scale test set and get a score over 82?

Thank you

20240112_102524.txt

yuhongtian17 commented 10 months ago

We have read the log file you provided. The config file and the pre-trained file used for training look fine; however, we found a major inconsistency: in our log file and in the log files of other SOTA methods trained on ms-trainval, the number of iterations in a single epoch is 8541 (8 GPUs, 1 sample per GPU, LSKNet) or 34163 (1 GPU, 2 samples per GPU, RVSA), while the number in your log file is 24830 (4 GPUs, 1 sample per GPU), which indicates an obviously inconsistent number of training images. We guess that your DOTA-v1.0 dataset (including train, val, test) has been polluted, or that the "ms_trainval.json" and "ms_test.json" used when splitting the dataset are not the original ones provided by MMRotate. Please check the "total images number" again, as mentioned in our previous reply.
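
To make the arithmetic explicit, a back-of-the-envelope check (assuming the number of training samples is roughly iterations per epoch x GPUs x samples per GPU):

def implied_num_samples(iters_per_epoch, gpus, samples_per_gpu):
    # Rough number of training samples implied by the per-epoch iteration count.
    return iters_per_epoch * gpus * samples_per_gpu

# Numbers quoted in this thread.
print(implied_num_samples(8541, 8, 1))   # 68328 (LSKNet log)
print(implied_num_samples(34163, 1, 2))  # 68326 (RVSA log)
print(implied_num_samples(24830, 4, 1))  # 99320 (the log file above) -- clearly inconsistent

The first two logs imply roughly the same number of samples, while the third implies about 30k more, which is why we suspect the split itself differs. (The implied count can be smaller than the raw patch count reported by the splitter if patches without annotations are filtered out during training; that filtering behaviour is an assumption about the config, not something verified here.)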

Some other doubts that may affect performance:

  1. The initial learning rate should be scaled linearly with the total batch size. For example, our total batch size is 8 and the initial learning rate is 1e-4; if the total batch size is set to 4, the initial learning rate should be scaled to 5e-5 (see the sketch after this list).
  2. Many paths in your config use double slashes ("//"), which looks unusual, e.g.:
data_root_ms = 'dataset/DOTA//split_ms/'
data_root_ss = 'dataset/DOTA//split_ss/'
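
For point 1, a minimal sketch of the linear scaling rule (the optimizer line in the comments is only an illustration of where the value would typically be set in an MMRotate config, not an exact copy of this repository's config):

# Linear scaling rule: lr_new = lr_base * (batch_new / batch_base)
BASE_BATCH_SIZE = 8   # 8 GPUs x 1 sample per GPU (our setup)
BASE_LR = 1e-4

def scaled_lr(total_batch_size, base_batch_size=BASE_BATCH_SIZE, base_lr=BASE_LR):
    # Scale the initial learning rate linearly with the total batch size.
    return base_lr * total_batch_size / base_batch_size

print(scaled_lr(4))  # 5e-05, matching the example in point 1
# In an MMRotate config this typically means editing something like:
#   optimizer = dict(type='AdamW', lr=5e-5, ...)
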
Diagreen commented 9 months ago

Thank you for your response again!

As you mentioned, it was a data pollution problem.

I regenerated my DOTA dataset with mmrotate/tools/img_split.py. As you said, the new data differs from the old one. After checking that the number of test images matches the number you gave above, I got an mAP of 82.23.

I'm really impressed by your great work again! Thank you!