kfiou with swin_tiny reimplementation problems

xiaoyihit commented 2 years ago

Reimplement a model in the model zoo using the provided configs

Checklist

I have searched related issues but cannot get the expected help.
The issue has not been fixed in the latest version.

Describe the issue

Reimplement a model in the model zoo using the provided configs:
configs/kfiou/r3det_kfiou_ln_swin_tiny_adamw_fpn_1x_dota_ms_rr_oc.py
configs/kfiou/roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.py

Reproduction

What command or script did you run?

tools/train.py

What config dir you run?

configs/kfiou/r3det_kfiou_ln_swin_tiny_adamw_fpn_1x_dota_ms_rr_oc.py
configs/kfiou/roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.py

Did you make any modifications on the code or config? Did you understand what you have modified?

I loaded the Swin Transformer model myself, since the site given is invalid.

What dataset did you use?

dotav1

Environment

Please run python mmrotate/utils/collect_env.py to collect necessary environment information and paste it here.

sys.platform: linux
Python: 3.7.11 (default, Jul 27 2021, 14:32:16) [GCC 7.5.0]
CUDA available: True
GPU 0,1: GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.1.TC455_06.29190527_0
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.10.0
PyTorch compiling details: PyTorch built with:

GCC 7.3
C++ Version: 201402
Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
OpenMP 201511 (a.k.a. OpenMP 4.5)
LAPACK is enabled (usually provided by MKL)
NNPACK is enabled
CPU capability usage: AVX512
CUDA Runtime 11.1
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
CuDNN 8.0.5
Magma 2.5.2
Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.11.1
OpenCV: 4.5.4
MMCV: 1.4.8
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMRotate: 0.1.0+

Results

The results are strange for both configs. For r3det_kfiou_ln_swin_tiny_adamw_fpn_1x_dota_ms_rr_oc, the loss goes to nan in the middle of training with the given config. The output is as follows:

2022-03-03 03:13:07,337 - mmrotate - INFO - Exp name: r3det_kfiou_ln_swin_tiny_adamw_fpn_1x_dota_ms_rr_oc.py
2022-03-03 03:13:07,337 - mmrotate - INFO - Epoch [4][3800/6400] lr: 1.000e-04, eta: 15:13:00, time: 1.222, data_time: 0.009, memory: 6728, s0.loss_cls: 0.8579, s0.loss_bbox: 49.3017, sr0.loss_cls: 0.5324, sr0.loss_bbox: 12.8906, loss: 63.5826, grad_norm: 238.8943
2022-03-03 03:14:09,169 - mmrotate - INFO - Epoch [4][3850/6400] lr: 1.000e-04, eta: 15:12:34, time: 1.237, data_time: 0.009, memory: 6728, s0.loss_cls: 0.9198, s0.loss_bbox: 45.2929, sr0.loss_cls: 0.5530, sr0.loss_bbox: 24.8593, loss: 71.6250, grad_norm: 303.1804
2022-03-03 03:15:09,412 - mmrotate - INFO - Epoch [4][3900/6400] lr: 1.000e-04, eta: 15:12:05, time: 1.205, data_time: 0.009, memory: 6728, s0.loss_cls: 0.9237, s0.loss_bbox: 38.0998, sr0.loss_cls: 0.5913, sr0.loss_bbox: 22.2580, loss: 61.8729, grad_norm: 258.2369
2022-03-03 03:16:09,798 - mmrotate - INFO - Epoch [4][3950/6400] lr: 1.000e-04, eta: 15:11:36, time: 1.208, data_time: 0.009, memory: 6728, s0.loss_cls: 0.8610, s0.loss_bbox: 51.9997, sr0.loss_cls: 0.4727, sr0.loss_bbox: 8.2045, loss: 61.5379, grad_norm: 245.4183
2022-03-03 03:17:10,993 - mmrotate - INFO - Epoch [4][4000/6400] lr: 1.000e-04, eta: 15:11:09, time: 1.224, data_time: 0.010, memory: 6728, s0.loss_cls: 0.9141, s0.loss_bbox: 43.9352, sr0.loss_cls: 0.4976, sr0.loss_bbox: 17.3902, loss: 62.7370, grad_norm: 251.7698
2022-03-03 03:18:10,932 - mmrotate - INFO - Epoch [4][4050/6400] lr: 1.000e-04, eta: 15:10:38, time: 1.199, data_time: 0.010, memory: 6728, s0.loss_cls: 0.8810, s0.loss_bbox: 46.5427, sr0.loss_cls: 0.5061, sr0.loss_bbox: 17.8371, loss: 65.7669, grad_norm: 223.0147
2022-03-03 03:19:10,435 - mmrotate - INFO - Epoch [4][4100/6400] lr: 1.000e-04, eta: 15:10:07, time: 1.190, data_time: 0.009, memory: 6728, s0.loss_cls: 0.9800, s0.loss_bbox: 38.0009, sr0.loss_cls: 0.4385, sr0.loss_bbox: 13.2763, loss: 52.6957, grad_norm: 182.5411
2022-03-03 03:20:11,712 - mmrotate - INFO - Epoch [4][4150/6400] lr: 1.000e-04, eta: 15:09:39, time: 1.225, data_time: 0.010, memory: 6728, s0.loss_cls: 0.8986, s0.loss_bbox: 45.8436, sr0.loss_cls: 0.4950, sr0.loss_bbox: 14.5906, loss: 61.8278, grad_norm: 248.6563
2022-03-03 03:21:10,936 - mmrotate - INFO - Epoch [4][4200/6400] lr: 1.000e-04, eta: 15:09:07, time: 1.184, data_time: 0.009, memory: 6728, s0.loss_cls: 0.8558, s0.loss_bbox: 44.0947, sr0.loss_cls: 0.5000, sr0.loss_bbox: 17.9017, loss: 63.3523, grad_norm: 211.6122
2022-03-03 03:22:10,696 - mmrotate - INFO - Epoch [4][4250/6400] lr: 1.000e-04, eta: 15:08:35, time: 1.195, data_time: 0.010, memory: 6728, s0.loss_cls: 0.8331, s0.loss_bbox: 39.5684, sr0.loss_cls: 0.6009, sr0.loss_bbox: 19.1346, loss: 60.1370, grad_norm: 207.3286
2022-03-03 03:23:11,753 - mmrotate - INFO - Epoch [4][4300/6400] lr: 1.000e-04, eta: 15:08:07, time: 1.221, data_time: 0.010, memory: 6728, s0.loss_cls: 0.8634, s0.loss_bbox: 42.2088, sr0.loss_cls: 0.6487, sr0.loss_bbox: 30.8503, loss: 74.5712, grad_norm: 228.5747
2022-03-03 03:24:08,064 - mmrotate - INFO - Epoch [4][4350/6400] lr: 1.000e-04, eta: 15:07:28, time: 1.126, data_time: 0.009, memory: 6728, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
2022-03-03 03:25:01,703 - mmrotate - INFO - Epoch [4][4400/6400] lr: 1.000e-04, eta: 15:06:42, time: 1.073, data_time: 0.009, memory: 6728, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
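A small helper can locate the first entry whose total loss is nan in a log like the one above. This is only a sketch: the function name `first_nan` and the regular expression are assumptions based on the log format shown in this thread, not part of mmrotate.

```python
import math
import re

# Matches the "Epoch [e][i/n]" header and the total "loss: <value>," field of one entry.
# re.S lets an entry span a line break, as happens in pasted logs.
ENTRY = re.compile(r"Epoch \[(\d+)\]\[(\d+)/\d+\].*?\bloss: (\S+?),", re.S)


def first_nan(log_text):
    """Return (epoch, iteration) of the first entry whose total loss is nan, else None."""
    for epoch, iteration, loss in ENTRY.findall(log_text):
        if math.isnan(float(loss)):
            return int(epoch), int(iteration)
    return None


sample = (
    "Epoch [4][4300/6400] lr: 1.000e-04, s0.loss_cls: 0.8634, loss: 74.5712, grad_norm: 228.5747 "
    "Epoch [4][4350/6400] lr: 1.000e-04, s0.loss_cls: nan, loss: nan, grad_norm: nan"
)
print(first_nan(sample))  # (4, 4350)
```

Knowing the exact iteration makes it easier to compare runs, for example to check whether a config change moves or removes the divergence point.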

For roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90, things are strange again. The loss seems normal, however the mAP is close to zero.

2022-03-03 10:36:05,468 - mmrotate - INFO - Epoch [12][5950/6400] lr: 1.000e-06, eta: 0:03:56, time: 0.307, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.0862, loss_rpn_bbox: 0.0175, s0.loss_cls: 0.1440, s0.acc: 95.4902, s0.loss_bbox: 0.1616, s1.loss_cls: 0.1116, s1.acc: 96.6328, s1.loss_bbox: 0.0469, loss: 0.5677, grad_norm: 2.8243
2022-03-03 10:36:20,704 - mmrotate - INFO - Epoch [12][6000/6400] lr: 1.000e-06, eta: 0:03:29, time: 0.305, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.0784, loss_rpn_bbox: 0.0115, s0.loss_cls: 0.1379, s0.acc: 95.9961, s0.loss_bbox: 0.1559, s1.loss_cls: 0.1037, s1.acc: 97.1230, s1.loss_bbox: 0.0312, loss: 0.5187, grad_norm: 2.3542
2022-03-03 10:36:35,877 - mmrotate - INFO - Epoch [12][6050/6400] lr: 1.000e-06, eta: 0:03:03, time: 0.303, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.1008, loss_rpn_bbox: 0.0221, s0.loss_cls: 0.1560, s0.acc: 94.7656, s0.loss_bbox: 0.1852, s1.loss_cls: 0.1271, s1.acc: 95.7930, s1.loss_bbox: 0.0425, loss: 0.6336, grad_norm: 2.9375
2022-03-03 10:36:51,185 - mmrotate - INFO - Epoch [12][6100/6400] lr: 1.000e-06, eta: 0:02:37, time: 0.306, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.1049, loss_rpn_bbox: 0.0182, s0.loss_cls: 0.1789, s0.acc: 94.1738, s0.loss_bbox: 0.2147, s1.loss_cls: 0.1394, s1.acc: 95.6074, s1.loss_bbox: 0.0549, loss: 0.7109, grad_norm: 3.0274
2022-03-03 10:37:06,526 - mmrotate - INFO - Epoch [12][6150/6400] lr: 1.000e-06, eta: 0:02:11, time: 0.307, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.1238, loss_rpn_bbox: 0.0199, s0.loss_cls: 0.1759, s0.acc: 94.2949, s0.loss_bbox: 0.1875, s1.loss_cls: 0.1410, s1.acc: 95.5137, s1.loss_bbox: 0.0447, loss: 0.6929, grad_norm: 3.0797
2022-03-03 10:37:21,769 - mmrotate - INFO - Epoch [12][6200/6400] lr: 1.000e-06, eta: 0:01:44, time: 0.305, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.0888, loss_rpn_bbox: 0.0145, s0.loss_cls: 0.1383, s0.acc: 95.7617, s0.loss_bbox: 0.1513, s1.loss_cls: 0.1136, s1.acc: 96.6421, s1.loss_bbox: 0.0477, loss: 0.5542, grad_norm: 2.6510
2022-03-03 10:37:37,021 - mmrotate - INFO - Epoch [12][6250/6400] lr: 1.000e-06, eta: 0:01:18, time: 0.305, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.1119, loss_rpn_bbox: 0.0170, s0.loss_cls: 0.1698, s0.acc: 95.0332, s0.loss_bbox: 0.1738, s1.loss_cls: 0.1317, s1.acc: 96.1934, s1.loss_bbox: 0.0454, loss: 0.6497, grad_norm: 2.8886
2022-03-03 10:37:52,335 - mmrotate - INFO - Epoch [12][6300/6400] lr: 1.000e-06, eta: 0:00:52, time: 0.306, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.1384, loss_rpn_bbox: 0.0296, s0.loss_cls: 0.1954, s0.acc: 93.4336, s0.loss_bbox: 0.2464, s1.loss_cls: 0.1495, s1.acc: 95.3066, s1.loss_bbox: 0.0577, loss: 0.8170, grad_norm: 3.6254
2022-03-03 10:38:07,793 - mmrotate - INFO - Epoch [12][6350/6400] lr: 1.000e-06, eta: 0:00:26, time: 0.309, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.1259, loss_rpn_bbox: 0.0295, s0.loss_cls: 0.1714, s0.acc: 94.6934, s0.loss_bbox: 0.1900, s1.loss_cls: 0.1361, s1.acc: 96.0684, s1.loss_bbox: 0.0520, loss: 0.7049, grad_norm: 3.2369
2022-03-03 10:38:23,037 - mmrotate - INFO - Exp name: roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.py
2022-03-03 10:38:23,037 - mmrotate - INFO - Epoch [12][6400/6400] lr: 1.000e-06, eta: 0:00:00, time: 0.305, data_time: 0.007, memory: 7279, loss_rpn_cls: 0.1173, loss_rpn_bbox: 0.0252, s0.loss_cls: 0.1731, s0.acc: 94.5508, s0.loss_bbox: 0.1745, s1.loss_cls: 0.1519, s1.acc: 95.3516, s1.loss_bbox: 0.0641, loss: 0.7061, grad_norm: 3.0836
2022-03-03 10:38:23,234 - mmrotate - INFO - Saving checkpoint at 12 epochs
2022-03-03 10:52:21,340 - mmrotate - INFO -
+--------------------+-------+-------+--------+-------+
| class              | gts   | dets  | recall | ap    |
+--------------------+-------+-------+--------+-------+
| plane              | 18788 | 38005 | 0.062  | 0.013 |
| baseball-diamond   | 1087  | 401   | 0.016  | 0.002 |
| bridge             | 4181  | 504   | 0.000  | 0.000 |
| ground-track-field | 733   | 374   | 0.023  | 0.003 |
| small-vehicle      | 58868 | 69520 | 0.014  | 0.000 |
| large-vehicle      | 43075 | 65885 | 0.010  | 0.000 |
| ship               | 76153 | 73019 | 0.013  | 0.000 |
| tennis-court       | 5923  | 10750 | 0.075  | 0.007 |
| basketball-court   | 1180  | 751   | 0.024  | 0.002 |
| storage-tank       | 13670 | 19423 | 0.017  | 0.001 |
| soccer-ball-field  | 827   | 736   | 0.018  | 0.001 |
| roundabout         | 973   | 190   | 0.006  | 0.000 |
| harbor             | 15468 | 28043 | 0.010  | 0.000 |
| swimming-pool      | 3836  | 4222  | 0.008  | 0.000 |
| helicopter         | 1189  | 502   | 0.024  | 0.005 |
+--------------------+-------+-------+--------+-------+
| mAP                |       |       |        | 0.002 |
+--------------------+-------+-------+--------+-------+
2022-03-03 10:52:21,404 - mmrotate - INFO - Exp name: roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.py
2022-03-03 10:52:21,405 - mmrotate - INFO - Epoch(val) [12][12800] mAP: 0.0024

yangxue0827 commented 2 years ago

Did you successfully reimplement the base config: r3det_kfiou_ln_r50_fpn_1x_dota_oc.py?

xiaoyihit commented 2 years ago

Yeah, although I only get mAP 72.06

yangxue0827 commented 2 years ago

There is a typo in r3det_kfiou_ln_swin_tiny_adamw_fpn_1x_dota_ms_rr_oc.py, use angle_version = 'oc' instead of angle_version = 'le90'
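Because the angle version appears in several places in a config (pipelines, datasets, coders), it can help to list every occurrence and check that they agree. The sketch below walks a plain nested dict; the function name `collect_angle_versions` is hypothetical, the key names `angle_version` and `angle_range` are assumptions, and `version` is the key that appears in the config posted later in this thread.

```python
def collect_angle_versions(cfg, path=""):
    """Yield (path, value) for every angle-version-like key in a nested config."""
    keys = ("version", "angle_version", "angle_range")  # key names are assumptions
    if isinstance(cfg, dict):
        for k, v in cfg.items():
            here = f"{path}.{k}" if path else str(k)
            if k in keys and isinstance(v, str):
                yield here, v
            else:
                yield from collect_angle_versions(v, here)
    elif isinstance(cfg, (list, tuple)):
        for i, item in enumerate(cfg):
            yield from collect_angle_versions(item, f"{path}[{i}]")


# Toy config with one inconsistent entry, for illustration only.
example = dict(
    angle_version="oc",
    train_pipeline=[dict(type="RRandomFlip", version="le90")],
    data=dict(train=dict(version="oc")),
)
found = dict(collect_angle_versions(example))
mismatched = {p: v for p, v in found.items() if v != "oc"}
print(mismatched)  # {'train_pipeline[0].version': 'le90'}
```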

xiaoyihit commented 2 years ago

Well, r3det_kfiou_ln_swin_tiny_adamw_fpn_2x_dota_ms_rr_oc is based on r3det_kfiou_ln_swin_tiny_adamw_fpn_1x_dota_ms_rr_oc, so I don't see any difference. Also, is using angle_version = 'le90' instead of angle_version = 'oc' what causes these problems?

xiaoyihit commented 2 years ago

I did try that, by the way:

2022-03-25 00:57:20,760 - mmrotate - INFO - Exp name: r3det_kfiou_ln_swin_tiny_adamw_fpn_2x_dota_ms_rr_oc.py
2022-03-25 00:57:20,761 - mmrotate - INFO - Epoch [3][2200/6400] lr: 1.000e-04, eta: 2 days, 2:44:45, time: 1.375, data_time: 0.147, memory: 6889, s0.loss_cls: 0.8992, s0.loss_bbox: 46.6267, sr0.loss_cls: 0.5464, sr0.loss_bbox: 6.5478, loss: 54.6200, grad_norm: 254.9667
2022-03-25 00:58:31,609 - mmrotate - INFO - Epoch [3][2250/6400] lr: 1.000e-04, eta: 2 days, 2:44:25, time: 1.417, data_time: 0.152, memory: 6889, s0.loss_cls: 0.8991, s0.loss_bbox: 52.8903, sr0.loss_cls: 0.5891, sr0.loss_bbox: 6.5397, loss: 60.9181, grad_norm: 234.1258
2022-03-25 00:59:41,344 - mmrotate - INFO - Epoch [3][2300/6400] lr: 1.000e-04, eta: 2 days, 2:43:54, time: 1.395, data_time: 0.153, memory: 6889, s0.loss_cls: 0.9001, s0.loss_bbox: 43.9201, sr0.loss_cls: 0.6084, sr0.loss_bbox: 9.1293, loss: 54.5579, grad_norm: 225.4924
2022-03-25 01:00:51,674 - mmrotate - INFO - Epoch [3][2350/6400] lr: 1.000e-04, eta: 2 days, 2:43:28, time: 1.407, data_time: 0.170, memory: 6889, s0.loss_cls: 0.8645, s0.loss_bbox: 40.6765, sr0.loss_cls: 0.4212, sr0.loss_bbox: 8.9599, loss: 50.9220, grad_norm: 237.7153
2022-03-25 01:02:00,876 - mmrotate - INFO - Epoch [3][2400/6400] lr: 1.000e-04, eta: 2 days, 2:42:52, time: 1.384, data_time: 0.149, memory: 6889, s0.loss_cls: 0.9447, s0.loss_bbox: 48.2193, sr0.loss_cls: 0.6202, sr0.loss_bbox: 8.7886, loss: 58.5729, grad_norm: 241.3439
2022-03-25 01:03:11,688 - mmrotate - INFO - Epoch [3][2450/6400] lr: 1.000e-04, eta: 2 days, 2:42:30, time: 1.416, data_time: 0.161, memory: 6889, s0.loss_cls: 0.9231, s0.loss_bbox: 54.3726, sr0.loss_cls: 0.6410, sr0.loss_bbox: 6.9682, loss: 62.9048, grad_norm: 244.5289
2022-03-25 01:04:19,729 - mmrotate - INFO - Epoch [3][2500/6400] lr: 1.000e-04, eta: 2 days, 2:41:43, time: 1.361, data_time: 0.147, memory: 6889, s0.loss_cls: 0.8697, s0.loss_bbox: 39.4403, sr0.loss_cls: 0.5404, sr0.loss_bbox: 6.2415, loss: 47.0919, grad_norm: 154.8938
2022-03-25 01:05:30,525 - mmrotate - INFO - Epoch [3][2550/6400] lr: 1.000e-04, eta: 2 days, 2:41:20, time: 1.416, data_time: 0.150, memory: 6889, s0.loss_cls: 0.9822, s0.loss_bbox: 56.3863, sr0.loss_cls: 0.5780, sr0.loss_bbox: 11.5071, loss: 69.4536, grad_norm: 236.7918
2022-03-25 01:06:40,029 - mmrotate - INFO - Epoch [3][2600/6400] lr: 1.000e-04, eta: 2 days, 2:40:46, time: 1.390, data_time: 0.162, memory: 6889, s0.loss_cls: 0.8897, s0.loss_bbox: 47.3726, sr0.loss_cls: 0.6092, sr0.loss_bbox: 7.9623, loss: 56.8338, grad_norm: 228.0067
2022-03-25 01:07:49,040 - mmrotate - INFO - Epoch [3][2650/6400] lr: 1.000e-04, eta: 2 days, 2:40:06, time: 1.380, data_time: 0.147, memory: 6889, s0.loss_cls: 0.9107, s0.loss_bbox: 55.2884, sr0.loss_cls: 0.5196, sr0.loss_bbox: 6.7891, loss: 63.5078, grad_norm: 265.5788
2022-03-25 01:08:56,614 - mmrotate - INFO - Epoch [3][2700/6400] lr: 1.000e-04, eta: 2 days, 2:39:14, time: 1.351, data_time: 0.159, memory: 6889, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
2022-03-25 01:10:02,129 - mmrotate - INFO - Epoch [3][2750/6400] lr: 1.000e-04, eta: 2 days, 2:38:04, time: 1.310, data_time: 0.156, memory: 6889, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
2022-03-25 01:11:09,186 - mmrotate - INFO - Epoch [3][2800/6400] lr: 1.000e-04, eta: 2 days, 2:37:07, time: 1.341, data_time: 0.156, memory: 6889, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
2022-03-25 01:12:15,525 - mmrotate - INFO - Epoch [3][2850/6400] lr: 1.000e-04, eta: 2 days, 2:36:04, time: 1.327, data_time: 0.160, memory: 6889, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
2022-03-25 01:13:19,685 - mmrotate - INFO - Epoch [3][2900/6400] lr: 1.000e-04, eta: 2 days, 2:34:41, time: 1.283, data_time: 0.154, memory: 6889, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
2022-03-25 01:14:26,097 - mmrotate - INFO - Epoch [3][2950/6400] lr: 1.000e-04, eta: 2 days, 2:33:39, time: 1.328, data_time: 0.166, memory: 6889, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
2022-03-25 01:15:32,833 - mmrotate - INFO - Epoch [3][3000/6400] lr: 1.000e-04, eta: 2 days, 2:32:39, time: 1.335, data_time: 0.162, memory: 6889, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
2022-03-25 01:16:39,540 - mmrotate - INFO - Epoch [3][3050/6400] lr: 1.000e-04, eta: 2 days, 2:31:39, time: 1.334, data_time: 0.157, memory: 6889, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan
2022-03-25 01:17:43,395 - mmrotate - INFO - Epoch [3][3100/6400] lr: 1.000e-04, eta: 2 days, 2:30:14, time: 1.277, data_time: 0.142, memory: 6889, s0.loss_cls: nan, s0.loss_bbox: nan, sr0.loss_cls: nan, sr0.loss_bbox: nan, loss: nan, grad_norm: nan

yangxue0827 commented 2 years ago

r3det+le will cause NAN

xiaoyihit commented 2 years ago

What about roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.py?

yangxue0827 commented 2 years ago

Is it NAN too?

xiaoyihit commented 2 years ago

Nah, its mAP is 0.

yangxue0827 commented 2 years ago

I will do some experiments about roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90, but need some time. We also need your feedback, especially about r3det_kfiou_ln_swin_tiny_adamw_fpn_1x_dota_ms_rr_oc.

yangxue0827 commented 2 years ago

I trained roi_trans_r50_fpn_1x_dota_le90.py and roi_trans_kfiou_ln_r50_fpn_1x_dota_le90.py successfully, but roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.py failed even though its training log looks normal. I'll keep debugging.

yangxue0827 commented 2 years ago

https://github.com/open-mmlab/mmrotate/blob/df125d7121d9d4074d1ccfdec60fd65ce58ff7d8/configs/kfiou/roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.py#L24-L31

change into

neck=dict(
        _delete_=True,
        type='FPN',
        in_channels=[96, 192, 384, 768],
        out_channels=256,
        add_extra_convs='on_input',
        num_outs=5)

@xiaoyihit
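As a quick sanity check on the in_channels list above: it matches the per-stage output widths of a Swin-Tiny backbone, assuming the standard embedding width of 96 that doubles at each of the four stages. The variable names below are illustrative only.

```python
embed_dims = 96   # assumed Swin-Tiny embedding width
num_stages = 4    # assumed number of backbone stages feeding the FPN

# Each stage doubles the channel width of the previous one.
in_channels = [embed_dims * 2 ** i for i in range(num_stages)]
print(in_channels)  # [96, 192, 384, 768]
```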

yangxue0827 commented 2 years ago

We find that the release version of mmrotate's kfiou+roi trans doesn't seem to be any better than the roi trans. Since the experiments in the paper are run on the code before the release version, and the release version has undergone many code refactorings, further parameter adjustment for kfiou+roi trans may be required. As you can see, the release version of mmrotate needs to integrate many methods, so there may be bad configuration files. However, we have released the weights and logs for the correct configuration files, and will add more models in the future.

lina926 commented 2 years ago

https://github.com/open-mmlab/mmrotate/blob/df125d7121d9d4074d1ccfdec60fd65ce58ff7d8/configs/kfiou/roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.py#L24-L31

change into
neck=dict(
        _delete_=True,
        type='FPN',
        in_channels=[96, 192, 384, 768],
        out_channels=256,
        add_extra_convs='on_input',
        num_outs=5)
@xiaoyihit

It will cause CUDA out of memory.

yangxue0827 commented 2 years ago

Add data = dict(samples_per_gpu=1, workers_per_gpu=1); refer to https://github.com/open-mmlab/mmrotate/blob/df125d7121d9d4074d1ccfdec60fd65ce58ff7d8/configs/roi_trans/roi_trans_swin_tiny_fpn_1x_dota_le90.py#L31

xiaoyihit commented 2 years ago

Just finished eval. This is your result for task 1:

mAP: 0.750360501827728
ap of each class: plane:0.894929653799662, baseball-diamond:0.7731615899566354, bridge:0.5157623225260385, ground-track-field:0.734778658651271, small-vehicle:0.7860644079039538, large-vehicle:0.8159744191901619, ship:0.8772641293948107, tennis-court:0.9089554260940254, basketball-court:0.8580533136030519, storage-tank:0.8515154228628408, soccer-ball-field:0.6271493722082745, roundabout:0.6387006969394813, harbor:0.6737986751942772, swimming-pool:0.7237979777030293, helicopter:0.5755014613884054
The submitted information is :

Description: r3det_kfiou_ln_swin_tiny_adamw_fpn_1x_dota_ms_rr_oc\Task1_results

yangxue0827 commented 2 years ago

Single GPU

This is your result for task 1:

mAP: 0.7639999397376762
ap of each class: plane:0.8920492711808281, baseball-diamond:0.8311689917311053, bridge:0.5487773220207339, ground-track-field:0.715685100739603, small-vehicle:0.7891244562305876, large-vehicle:0.8295689284111806, ship:0.8804525056013857, tennis-court:0.9089562289562291, basketball-court:0.8732825143127781, storage-tank:0.8590706510255619, soccer-ball-field:0.6396620876719492, roundabout:0.6472101230008205, harbor:0.7628661551618027, swimming-pool:0.7013281937144408, helicopter:0.5807965663061383
The submitted information is :

Description: roi_trans_r50_fpn_1x_dota_le90

This is your result for task 1:

mAP: 0.7577335010077175
ap of each class: plane:0.8894649588480816, baseball-diamond:0.8290480970945256, bridge:0.5548931517545318, ground-track-field:0.7173129226267527, small-vehicle:0.7898956105813991, large-vehicle:0.8167135005240188, ship:0.8793419879806488, tennis-court:0.9090909090909093, basketball-court:0.8708165080114506, storage-tank:0.8577658662597724, soccer-ball-field:0.6529545308196423, roundabout:0.6157234981467674, harbor:0.752155680103153, swimming-pool:0.7138769381846977, helicopter:0.5169483550894158
The submitted information is :

Description: roi_trans_kfiou_ln_r50_fpn_1x_dota_le90

yangxue0827 commented 2 years ago

If you use multiple GPUs, the lr needs to be modified: lr = lr * gpu_num.
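The scaling rule in the comment above can be written as a one-line helper. This is a minimal sketch: the function name `scale_lr` is hypothetical, and the single-GPU value of 1e-4 is simply the lr that appears in the training logs earlier in this thread, used here for illustration.

```python
def scale_lr(base_lr, gpu_num):
    """Linear scaling as described above: lr = lr * gpu_num."""
    if gpu_num < 1:
        raise ValueError("gpu_num must be >= 1")
    return base_lr * gpu_num


single_gpu_lr = 1e-4  # illustrative value taken from the logs in this thread
for n in (1, 2, 4):
    print(n, scale_lr(single_gpu_lr, n))
```

The scaled value would then replace the lr entry of the optimizer settings in the config being trained.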

xiaoyihit commented 2 years ago

I am using a single GPU.

xiaoyihit commented 2 years ago

Similar results:
configs/kfiou/roi_trans_kfiou_ln_r50_fpn_1x_dota_le90.py 0.7538
configs/kfiou/roi_trans_kfiou_ln_r50_fpn_1x_dota_ms_rr_le90.py 0.7660

yangxue0827 commented 2 years ago

Gaussian-based methods (e.g. gwd, kld and kfiou) are sensitive to the loss weight. Tuning may be required for kfiou in two-stage methods.

PS: roi_trans_kfiou_ln_r50_fpn_1x_dota_ms_rr_le90 is meant to be trained on the ms_trainval set and tested on the ms_test set. The ms_trainval and ms_test sets are split by
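On the loss-weight sensitivity mentioned above, one way to organise a tuning sweep is to generate one loss config per candidate weight and train each in turn. The sketch below is an assumption-laden illustration: the starting dict (type 'KFLoss', fun 'ln', loss_weight 5.0), the candidate weights and the helper name `loss_weight_variants` are all hypothetical and should be replaced with the values from the config actually being tuned.

```python
import copy

# Assumed starting point; type, fun and loss_weight are illustrative values.
base_loss_bbox = dict(type="KFLoss", fun="ln", loss_weight=5.0)


def loss_weight_variants(loss_cfg, weights):
    """Return one deep-copied loss config per candidate loss_weight."""
    variants = []
    for w in weights:
        cfg = copy.deepcopy(loss_cfg)  # leave the base config untouched
        cfg["loss_weight"] = float(w)
        variants.append(cfg)
    return variants


for cfg in loss_weight_variants(base_loss_bbox, [1.0, 2.5, 5.0, 10.0]):
    print(cfg)
```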

xiaoyihit commented 2 years ago

Thx for pointing out the mistake. Seems that I forgot the multi-scale config, I will check that out soon.

xiaoyihit commented 2 years ago

https://github.com/open-mmlab/mmrotate/blob/df125d7121d9d4074d1ccfdec60fd65ce58ff7d8/configs/kfiou/roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.py#L24-L31

change into

neck=dict(
        _delete_=True,
        type='FPN',
        in_channels=[96, 192, 384, 768],
        out_channels=256,
        add_extra_convs='on_input',
        num_outs=5)

@xiaoyihit

There seems to be a bug,

Traceback (most recent call last):
  File "tools/train.py", line 252, in <module>
    main()
  File "tools/train.py", line 246, in main
    meta=meta)
  File "/remote-home/xiaoyi/mmrotate-main/mmrotate/apis/train.py", line 156, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 75, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 248, in train_step
    losses = self(**data)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 109, in new_func
    return old_func(*args, **kwargs)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/remote-home/xiaoyi/mmrotate-main/mmrotate/models/detectors/two_stage.py", line 150, in forward_train
    **kwargs)
  File "/remote-home/xiaoyi/mmrotate-main/mmrotate/models/roi_heads/roi_trans_roi_head.py", line 238, in forward_train
    rcnn_train_cfg)
  File "/remote-home/xiaoyi/mmrotate-main/mmrotate/models/roi_heads/roi_trans_roi_head.py", line 155, in _bbox_forward_train
    bbox_results = self._bbox_forward(stage, x, rois)
  File "/remote-home/xiaoyi/mmrotate-main/mmrotate/models/roi_heads/roi_trans_roi_head.py", line 126, in _bbox_forward
    rois)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 197, in new_func
    return old_func(*args, **kwargs)
  File "/remote-home/xiaoyi/mmrotate-main/mmrotate/models/roi_heads/roi_extractors/rotate_single_level_roi_extractor.py", line 133, in forward
    roi_feats_t = self.roi_layers[i](feats[i], rois_)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmcv/ops/roi_align_rotated.py", line 171, in forward
    self.clockwise)
  File "/opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/mmcv/ops/roi_align_rotated.py", line 70, in forward
    clockwise=ctx.clockwise)
RuntimeError: CUDA error: an illegal memory access was encountered
terminate called after throwing an instance of 'c10::CUDAError'
  what():  CUDA error: an illegal memory access was encountered
Exception raised from create_event_internal at /opt/conda/conda-bld/pytorch_1634272178570/work/c10/cuda/CUDACachingAllocator.cpp:1211 (most recent call first):
frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7f0606f6bd62 in /opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: <unknown function> + 0x1c613 (0x7f065e85a613 in /opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #2: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0x1a2 (0x7f065e85b022 in /opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/lib/libc10_cuda.so)
frame #3: c10::TensorImpl::release_resources() + 0xa4 (0x7f0606f55314 in /opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #4: <unknown function> + 0x294dd9 (0x7f06dc8d2dd9 in /opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #5: <unknown function> + 0xae2f59 (0x7f06dd120f59 in /opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #6: THPVariable_subclass_dealloc(_object*) + 0x2b9 (0x7f06dd121279 in /opt/conda/envs/mmrotatev2/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
<omitting python frames>
frame #24: __libc_start_main + 0xe7 (0x7f0717e39bf7 in /lib/x86_64-linux-gnu/libc.so.6)

Aborted (core dumped)

My config:

dataset_type = 'DOTADataset'
data_root = '/remote-home/xiaoyi/datasets/dotav1/split_1024_dota1_0/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(1024, 1024)),
    dict(
        type='RRandomFlip',
        flip_ratio=[0.25, 0.25, 0.25],
        direction=['horizontal', 'vertical', 'diagonal'],
        version='le90'),
    dict(
        type='Normalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1024, 1024),
        flip=False,
        transforms=[
            dict(type='RResize'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='DOTADataset',
        ann_file=
        '/remote-home/xiaoyi/datasets/dotav1/split_1024_dota1_0/trainval/annfiles/',
        img_prefix=
        '/remote-home/xiaoyi/datasets/dotav1/split_1024_dota1_0/trainval/images/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
            dict(type='RResize', img_scale=(1024, 1024)),
            dict(
                type='RRandomFlip',
                flip_ratio=[0.25, 0.25, 0.25],
                direction=['horizontal', 'vertical', 'diagonal'],
                version='le90'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='DefaultFormatBundle'),
            dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
        ],
        version='le90'),
    val=dict(
        type='DOTADataset',
        ann_file=
        '/remote-home/xiaoyi/datasets/dotav1/split_1024_dota1_0/trainval/annfiles/',
        img_prefix=
        '/remote-home/xiaoyi/datasets/dotav1/split_1024_dota1_0/trainval/images/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1024, 1024),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'),
    test=dict(
        type='DOTADataset',
        ann_file=
        '/remote-home/xiaoyi/datasets/dotav1/split_1024_dota1_0/test/images/',
        img_prefix=
        '/remote-home/xiaoyi/datasets/dotav1/split_1024_dota1_0/test/images/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1024, 1024),
                flip=False,
                transforms=[
                    dict(type='RResize'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='DefaultFormatBundle'),
                    dict(type='Collect', keys=['img'])
                ])
        ],
        version='le90'))
evaluation = dict(interval=12, metric='mAP')
optimizer = dict(
    type='AdamW',
    lr=0.0001,
    betas=(0.9, 0.999),
    weight_decay=0.05,
    paramwise_cfg=dict(
        custom_keys=dict(
            absolute_pos_embed=dict(decay_mult=0.0),
            relative_position_bias_table=dict(decay_mult=0.0),
            norm=dict(decay_mult=0.0))))
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.3333333333333333,
    step=[8, 11])
runner = dict(type='EpochBasedRunner', max_epochs=12)
checkpoint_config = dict(interval=12)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
angle_version = 'le90'
model = dict(
    type='RoITransformer',
    backbone=dict(
        type='SwinTransformer',
        embed_dims=96,
        depths=[2, 2, 6, 2],
        num_heads=[3, 6, 12, 24],
        window_size=7,
        mlp_ratio=4,
        qkv_bias=True,
        qk_scale=None,
        drop_rate=0.0,
        attn_drop_rate=0.0,
        drop_path_rate=0.2,
        patch_norm=True,
        out_indices=(0, 1, 2, 3),
        with_cp=False,
        convert_weights=True,
        init_cfg=dict(
            type='Pretrained',
            checkpoint=
            'https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth'
        )),
    neck=dict(
        type='FPN',
        in_channels=[96, 192, 384, 768],
        out_channels=256,
        add_extra_convs='on_input',
        num_outs=5),
    rpn_head=dict(
        type='RotatedRPNHead',
        in_channels=256,
        feat_channels=256,
        version='le90',
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(
            type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
    roi_head=dict(
        type='RoITransRoIHead',
        version='le90',
        num_stages=2,
        stage_loss_weights=[1, 1],
        bbox_roi_extractor=[
            dict(
                type='SingleRoIExtractor',
                roi_layer=dict(
                    type='RoIAlign', output_size=7, sampling_ratio=0),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32]),
            dict(
                type='RotatedSingleRoIExtractor',
                roi_layer=dict(
                    type='RoIAlignRotated',
                    out_size=7,
                    sample_num=2,
                    clockwise=True),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32])
        ],
        bbox_head=[
            dict(
                type='RotatedShared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=15,
                bbox_coder=dict(
                    type='DeltaXYWHAHBBoxCoder',
                    angle_range='le90',
                    norm_factor=2,
                    edge_swap=True,
                    target_means=[0.0, 0.0, 0.0, 0.0, 0.0],
                    target_stds=[0.1, 0.1, 0.2, 0.2, 1]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(
                    type='SmoothL1Loss',
                    beta=0.1111111111111111,
                    loss_weight=1.0)),
            dict(
                type='RotatedKFIoUShared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                num_classes=15,
                bbox_coder=dict(
                    type='DeltaXYWHAOBBoxCoder',
                    angle_range='le90',
                    norm_factor=None,
                    edge_swap=True,
                    proj_xy=True,
                    target_means=[0.0, 0.0, 0.0, 0.0, 0.0],
                    target_stds=[0.05, 0.05, 0.1, 0.1, 0.5]),
                reg_class_agnostic=False,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='KFLoss', fun='ln', loss_weight=0.5))
        ]),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                match_low_quality=True,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=0,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=[
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    ignore_iof_thr=-1,
                    iou_calculator=dict(type='BboxOverlaps2D')),
                sampler=dict(
                    type='RandomSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False),
            dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=False,
                    ignore_iof_thr=-1,
                    iou_calculator=dict(type='RBboxOverlaps2D')),
                sampler=dict(
                    type='RRandomSampler',
                    num=512,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)
        ]),
    test_cfg=dict(
        rpn=dict(
            nms_pre=2000,
            max_per_img=2000,
            nms=dict(type='nms', iou_threshold=0.7),
            min_bbox_size=0),
        rcnn=dict(
            nms_pre=2000,
            min_bbox_size=0,
            score_thr=0.05,
            nms=dict(type='le90', iou_thr=0.1),
            max_per_img=2000)))
pretrained = 'https://github.com/SwinTransformer/storage/releases/download/v1.0.0/swin_tiny_patch4_window7_224.pth'
find_unused_parameters = True
work_dir = './work_dirs/roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90'
auto_resume = False
gpu_ids = [1]
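
For reference, the dump above still lists the GitHub release URL as the backbone checkpoint, even though the pretrained weights were loaded manually. A minimal sketch of pointing `init_cfg` at a local file instead is shown below. The path `checkpoints/swin_tiny_patch4_window7_224.pth` and the helper `use_local_checkpoint` are hypothetical names used only for illustration; only the `init_cfg` keys are taken from the config above.

```python
import copy

# Fragment of the backbone settings from the config above (relevant keys only).
backbone = dict(
    type='SwinTransformer',
    convert_weights=True,
    init_cfg=dict(
        type='Pretrained',
        checkpoint='https://github.com/SwinTransformer/storage/releases/'
                   'download/v1.0.0/swin_tiny_patch4_window7_224.pth'))


def use_local_checkpoint(backbone_cfg, local_path):
    """Return a copy of the backbone config whose checkpoint is a local file.

    Hypothetical helper: it only rewrites the 'checkpoint' entry and leaves
    every other key untouched.
    """
    cfg = copy.deepcopy(backbone_cfg)
    cfg['init_cfg']['checkpoint'] = local_path
    return cfg


# Hypothetical download location; replace with wherever the file was saved.
LOCAL_CKPT = 'checkpoints/swin_tiny_patch4_window7_224.pth'

local_backbone = use_local_checkpoint(backbone, LOCAL_CKPT)
print(local_backbone['init_cfg'])
```

If the locally downloaded file is not the same checkpoint as the one behind the URL, the backbone would start from different weights, so it may be worth confirming in the training log that the checkpoint was actually loaded from the intended path.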

yangxue0827 commented 2 years ago

It runs normally for me on a 2080 Ti GPU. Attached: roi_trans_kfiou_ln_swin_tiny_fpn_1x_dota_le90.txt

xiaoyihit commented 2 years ago

mAP: 0.7907676910846436

AP of each class:
plane: 0.8951765569961131
baseball-diamond: 0.8326362977155665
bridge: 0.5660402122647823
ground-track-field: 0.7565563051287862
small-vehicle: 0.8090605281971763
large-vehicle: 0.8372130002571467
ship: 0.8844067743199389
tennis-court: 0.9088127411811229
basketball-court: 0.8536264881156366
storage-tank: 0.8752888005548948
soccer-ball-field: 0.6697054274046957
roundabout: 0.7047203486881632
harbor: 0.775892803409076
swimming-pool: 0.7752507482116147
helicopter: 0.7171283338249412

COCO style result:
AP50: 0.7907676910846436
AP75: 0.4811109336025689
mAP: 0.47054638715884184

The submitted information is:
Description: r3det_kfiou_ln_swin_tiny_adamw_fpn_2x_dota_ms_rr_oc\Task1_results

This is still well short of the expected 80.90%.
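
As a quick sanity check on the numbers above, the sketch below averages the 15 per-class APs, measures the gap to the 80.90% target, and lists the weakest classes. It assumes the reported mAP is the plain unweighted mean over classes; the variable names are illustrative only.

```python
# Per-class AP values copied from the evaluation result above.
per_class_ap = {
    'plane': 0.8951765569961131,
    'baseball-diamond': 0.8326362977155665,
    'bridge': 0.5660402122647823,
    'ground-track-field': 0.7565563051287862,
    'small-vehicle': 0.8090605281971763,
    'large-vehicle': 0.8372130002571467,
    'ship': 0.8844067743199389,
    'tennis-court': 0.9088127411811229,
    'basketball-court': 0.8536264881156366,
    'storage-tank': 0.8752888005548948,
    'soccer-ball-field': 0.6697054274046957,
    'roundabout': 0.7047203486881632,
    'harbor': 0.775892803409076,
    'swimming-pool': 0.7752507482116147,
    'helicopter': 0.7171283338249412,
}

# Unweighted mean over the 15 DOTA classes (assumed definition of mAP here).
mean_ap = sum(per_class_ap.values()) / len(per_class_ap)

# Gap to the 80.90% figure quoted in the comment, in percentage points.
target = 0.8090
gap_points = (target - mean_ap) * 100

# The three classes with the lowest AP, lowest first.
weakest = sorted(per_class_ap, key=per_class_ap.get)[:3]

print(f'mean AP: {mean_ap:.4f}')
print(f'gap to target: {gap_points:.2f} points')
print('weakest classes:', weakest)
```

The mean agrees with the reported mAP, so the shortfall of roughly 1.8 points comes from the per-class results themselves and not from how the summary was computed.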

yangxue0827 commented 2 years ago

This is mine:

r3det_kfiou_ln_swin_tiny_adamw_fpn_2x_dota_ms_rr_oc.txt

yangxue0827 commented 2 years ago

Partial log: 20220424_185126.log.json.txt

xiaoyihit commented 2 years ago

Config and log: r3det_kfiou_ln_swin_tiny_adamw_fpn_2x_dota_ms_rr_oc.txt, 20220422_104626.log.json.txt

open-mmlab / mmrotate

kfiou with swin_tiny reimplementation problems #236