facias914 opened this issue 1 year ago
@facias914 We suggest using AdamW for training vision transformer networks and reproducing the method with our parameter settings. The detailed settings can be found in the config files.
Thank you for your reply. I just used the checkpoint file you provided. I compared the parameter list and it is consistent with yours; the specific content is above.
@facias914 I mean the hyperparameters, such as the optimizer, scheduler, and so on. Please refer to this config: https://github.com/ViTAE-Transformer/Remote-Sensing-RVSA/blob/main/Object%20Detection/configs/obb/oriented_rcnn/vit_base_win/faster_rcnn_orpn_our_rsp_vit-base-win-rvsa_v3_wsz7_fpn_1x_dota10_lr1e-4_ldr75_dpr15.py
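For anyone reading along: the filename of that config already encodes the key training hyperparameters (lr1e-4, ldr75 = layer-wise decay 0.75, dpr15 = drop path 0.15, 1x schedule). Below is a minimal mmcv-style sketch of those settings; the weight decay, warmup values, and the layer-decay constructor details are assumptions, so take the exact fields from the linked config.

```python
# Minimal sketch of the hyperparameters implied by the config filename
# (lr1e-4, ldr75 = layer decay 0.75, dpr15 = drop path 0.15, 1x schedule).
# Values not encoded in the filename (weight decay, warmup, constructor name)
# are assumptions -- read them from the linked config file.
optimizer = dict(
    type='AdamW',
    lr=1e-4,
    weight_decay=0.05,                          # assumed, check the config
    paramwise_cfg=dict(layer_decay_rate=0.75),  # layer-wise lr decay (ldr75),
                                                # applied by the repo's custom
                                                # optimizer constructor
)
optimizer_config = dict(grad_clip=None)

# "1x" schedule: 12 epochs with step decay, standard mmdet-style settings.
lr_config = dict(policy='step', warmup='linear', warmup_iters=500, step=[8, 11])
total_epochs = 12

# drop_path_rate=0.15 (dpr15) is set on the ViT backbone inside model=dict(...).
```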
I know what you mean, but I only used the checkpoint file for inference, not for training.
Now I want to reproduce the code with the old mmrotate version, but the old mmrotate needs the DOTA dataset's annotations in pkl format. Could you send me a DOTA dataset with pkl annotations? Thank you!
@facias914 Which checkpoint did you use? Did you use this one: https://1drv.ms/u/s!AimBgYV7JjTlgVJM4Znng50US8KD?e=o4MRMQ ?
@facias914 The DOTA dataset needs to be clipped with BBoxToolkit; then the pkl can be obtained.
Yes, I used that checkpoint.
Thank you!
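A quick note on the clipping step mentioned above: after BBoxToolkit splits the images (its splitting script is driven by a JSON config like the one posted further down in this thread), the resulting annotation pkl can be sanity-checked with plain pickle. The path below is hypothetical and the record layout depends on the BBoxToolkit version, so inspect the top-level object first.

```python
# Sanity-check the annotation pkl produced by the BBoxToolkit split.
# The path is hypothetical and the record layout depends on the BBoxToolkit
# version, so start by printing the top-level type/keys.
import pickle

ann_path = 'data/split_ss_dota1_0/trainval/annfiles/patch_annfile.pkl'  # hypothetical
with open(ann_path, 'rb') as f:
    ann = pickle.load(f)

print(type(ann))
if isinstance(ann, dict):
    print(list(ann.keys()))      # e.g. class list plus per-patch records
elif isinstance(ann, list):
    print(len(ann))              # e.g. one record per image patch
    print(ann[0])
```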
So the inference is conducted on unclipped images?
No, I used clipped val images and clipped test images with size=1024 and gap=200, obtaining mAP=68 and mAP=54 respectively.
@facias914 OK. In fact, we didn't conduct local validation: we trained the model directly on the merged train+val set and submitted the results on the testing set to the evaluation website. You can run the same evaluation. The testing set also needs to be clipped with BBoxToolkit.
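For later readers: clipping the test set uses the same single-scale BBoxToolkit split as trainval (size 1024, gap 200; a trainval config is posted further down in this thread), just pointed at the test images with no annotation directories. The sketch below writes such a config from Python; the exact key set and paths are assumptions, so compare them against the sample JSONs shipped with BBoxToolkit.

```python
# Rough sketch of a single-scale split config for the DOTA test set, mirroring
# the trainval JSON posted later in this thread (size 1024, gap 200).
# Paths and the exact key set are assumptions -- compare with the sample
# configs shipped with BBoxToolkit before using it.
import json

test_split_cfg = {
    "nproc": 20,
    "load_type": "dota",
    "img_dirs": ["data/DOTA1_0/test/images/"],
    "ann_dirs": None,               # the test set has no labels
    "sizes": [1024],
    "gaps": [200],
    "rates": [1.0],
    "img_rate_thr": 0.6,
    "iof_thr": 0.7,
    "no_padding": False,
    "padding_value": [104, 116, 124],
    "filter_empty": False,          # keep empty patches for inference
    "save_dir": "data/split_ss_dota1_0/test/",
    "save_ext": ".png",
}

with open("ss_test.json", "w") as f:
    json.dump(test_split_cfg, f, indent=2)
```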
Thank you! I got mAP=87 on the val set using the old mmrotate version. By the way, I got the test set pkl file, but I can't find the code to convert the pkl to txt. I used a pkl2txt script I wrote myself, but failed to get a result on the DOTA website. Can you tell me where the pkl2txt file is?
@facias914 Shouldn't obbdetection automatically convert the pkl? (mmrotate is built based on obbdetection) https://mmrotate.readthedocs.io/en/latest/get_started.html#test-a-model
Yes, I used OBBDetection built from https://github.com/jbwang1997/OBBDetection, and I have obtained the pkl file. But the DOTA website requires txt files, so the pkl2txt code is what I need now.
@facias914 For DOTA-V1.0, use --format-only and OBBDetection will automatically produce the required format; please refer to our readme.
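For reference, the format the DOTA-v1.0 server expects for Task1 is one txt file per class (Task1_<classname>.txt), each line holding the original image id, the confidence, and the eight polygon coordinates. --format-only produces exactly this (including merging patch results back onto the original images); the sketch below only illustrates the target layout from an already-parsed detection list and is not the converter OBBDetection uses.

```python
# Illustrates the DOTA-v1.0 Task1 submission layout: one file per class named
# Task1_<classname>.txt, each line "image_id score x1 y1 x2 y2 x3 y3 x4 y4",
# where image_id is the original (un-clipped) image name, e.g. "P0001".
# `dets` is a hypothetical, already-parsed structure -- this is not the
# converter behind --format-only, just the target format.
import os

def write_dota_task1(dets, out_dir):
    """dets: dict mapping class name -> list of (image_id, score, 8 poly coords)."""
    os.makedirs(out_dir, exist_ok=True)
    for cls_name, items in dets.items():
        with open(os.path.join(out_dir, f'Task1_{cls_name}.txt'), 'w') as f:
            for image_id, score, poly in items:
                coords = ' '.join(f'{v:.2f}' for v in poly)
                f.write(f'{image_id} {score:.4f} {coords}\n')

# Example with a single dummy detection:
write_dota_task1(
    {'plane': [('P0001', 0.98, [10, 10, 60, 10, 60, 40, 10, 40])]},
    'dota_submission')
```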
Thank you very much for your reply. I also reached 78.74 on the test set. Next, I will investigate why there is such a big gap between my reproduced code and the official code.
Hi,
May I ask which config file in BboxToolkit you used to split the DOTA dataset? Is it ss_trainval.json?
{ "nproc": 20, "load_type": "dota", "img_dirs": [ "data/DOTA1_0/train/images/", "data/DOTA1_0/val/images/" ], "ann_dirs": [ "data/DOTA1_0/train/labelTxt/", "data/DOTA1_0/val/labelTxt/" ], "classes": null, "prior_annfile": null, "merge_type": "addition", "sizes": [ 1024 ], "gaps": [ 200 ], "rates": [ 1.0 ], "img_rate_thr": 0.6, "iof_thr": 0.7, "no_padding": false, "padding_value": [ 104, 116, 124 ], "filter_empty": true, "save_dir": "data/split_ss_dota1_0/trainval/", "save_ext": ".png" }
@facias914 Hello, I also encountered the same issue. I used the official ViTAE-B + RVSA model to run inference on the test dataset, but the mAP I obtained is only 0.394. I would like to ask how you achieved a mAP similar to the authors'. Could you provide some help? I have shared some of my configuration information in this issue: https://github.com/ViTAE-Transformer/Remote-Sensing-RVSA/issues/39#issue-2417829111
The original code version is too old, so I ported the code to the new mmrotate version. I loaded the weights you provided and it went fine; the result was an accuracy of 68 on the validation set and 54 on the test set.
I don't know where the problem is. The weight file loads smoothly, and I have also checked the configuration file parameters, but I just can't figure out what went wrong. If there were a problem with my reproduction, the final result should be 0, not as high as 54.