yuxiangsun / RTFNet

RGB-Thermal Fusion Network for Semantic Segmentation of Urban Scenes
MIT License

Training script #4

Closed: celynw closed this issue 4 years ago

celynw commented 5 years ago

I've been able to reproduce the results in your paper using the pretrained weights (RTFNet-152). Since you didn't provide a training script, I've used my own. However, I wasn't able to get anywhere close to the same mAcc and mIoU.

Do you plan to release your training script? Could you perhaps detail your procedure?

I have tried the parameters detailed in your paper, and I have also tried freezing the ResNet weights.

celynw commented 5 years ago

(Edited to add more information)

echoofluoc commented 5 years ago

I have the same question as @celynwalters. The provided RTFNet-50 weight file isn't correct (only 8.5% mIoU). I'm also quite curious about the result for RTFNet-34: Fig. 3 in your paper says it gets 26% mIoU. Can you upload that weight file as well, @yuxiangsun?

Besides, I find there's a problem in the test code. Lines 61-63 in test.py are:

    start_time = time.time()
    logits = model(images)
    end_time = time.time()

CUDA kernels run asynchronously with respect to the CPU, so we need torch.cuda.synchronize() to make sure the GPU has finished before reading the clock; otherwise the time measurement may not be correct:

    torch.cuda.synchronize()
    start = time.time()
    logits = model(images)
    torch.cuda.synchronize()
    end = time.time()

Leaving the RTFNet-50 weight file aside, when I test the RTFNet-152 weight file (on a TITAN Xp GPU with no other process running on it), I get:

| the gpu count: 2 | the current used gpu: 0

| use the final model file. | testing RTFNet: RTFNet_152 on GPU #0 with pytorch | loading model file ./weights_backup/RTFNet_152/final.pth... done! | frame 1/393, time cost: 220.16 ms | frame 2/393, time cost: 86.85 ms | frame 3/393, time cost: 84.46 ms | frame 4/393, time cost: 82.24 ms | frame 5/393, time cost: 96.72 ms | frame 6/393, time cost: 85.46 ms | frame 7/393, time cost: 106.69 ms | frame 8/393, time cost: 83.34 ms | frame 9/393, time cost: 82.99 ms | frame 10/393, time cost: 85.59 ms | frame 11/393, time cost: 85.26 ms | frame 12/393, time cost: 82.08 ms | frame 13/393, time cost: 81.99 ms | frame 14/393, time cost: 83.79 ms | frame 15/393, time cost: 84.02 ms | frame 16/393, time cost: 85.81 ms | frame 17/393, time cost: 84.79 ms | frame 18/393, time cost: 82.35 ms | frame 19/393, time cost: 85.46 ms | frame 20/393, time cost: 85.80 ms | frame 21/393, time cost: 85.78 ms | frame 22/393, time cost: 81.98 ms | frame 23/393, time cost: 86.75 ms | frame 24/393, time cost: 83.61 ms | frame 25/393, time cost: 86.35 ms | frame 26/393, time cost: 82.92 ms | frame 27/393, time cost: 89.65 ms | frame 28/393, time cost: 86.23 ms | frame 29/393, time cost: 82.18 ms | frame 30/393, time cost: 84.70 ms | frame 31/393, time cost: 85.58 ms | frame 32/393, time cost: 85.67 ms | frame 33/393, time cost: 84.11 ms | frame 34/393, time cost: 83.37 ms | frame 35/393, time cost: 85.22 ms | frame 36/393, time cost: 86.16 ms | frame 37/393, time cost: 101.67 ms | frame 38/393, time cost: 85.83 ms | frame 39/393, time cost: 85.43 ms | frame 40/393, time cost: 83.34 ms | frame 41/393, time cost: 84.87 ms | frame 42/393, time cost: 86.39 ms | frame 43/393, time cost: 85.10 ms | frame 44/393, time cost: 83.32 ms | frame 45/393, time cost: 85.29 ms | frame 46/393, time cost: 84.51 ms | frame 47/393, time cost: 84.89 ms | frame 48/393, time cost: 83.27 ms | frame 49/393, time cost: 99.04 ms | frame 50/393, time cost: 85.45 ms | frame 51/393, time cost: 85.94 ms | frame 52/393, time 
cost: 82.29 ms | frame 53/393, time cost: 103.65 ms | frame 54/393, time cost: 86.40 ms | frame 55/393, time cost: 91.47 ms | frame 56/393, time cost: 81.98 ms | frame 57/393, time cost: 92.15 ms | frame 58/393, time cost: 85.73 ms | frame 59/393, time cost: 85.78 ms | frame 60/393, time cost: 100.76 ms | frame 61/393, time cost: 82.69 ms | frame 62/393, time cost: 82.89 ms | frame 63/393, time cost: 82.70 ms | frame 64/393, time cost: 82.91 ms | frame 65/393, time cost: 100.10 ms | frame 66/393, time cost: 82.31 ms | frame 67/393, time cost: 82.32 ms | frame 68/393, time cost: 85.65 ms | frame 69/393, time cost: 85.85 ms | frame 70/393, time cost: 86.50 ms | frame 71/393, time cost: 86.36 ms | frame 72/393, time cost: 85.87 ms | frame 73/393, time cost: 85.88 ms | frame 74/393, time cost: 85.81 ms | frame 75/393, time cost: 85.95 ms | frame 76/393, time cost: 85.98 ms | frame 77/393, time cost: 103.16 ms | frame 78/393, time cost: 82.44 ms | frame 79/393, time cost: 104.61 ms | frame 80/393, time cost: 86.85 ms | frame 81/393, time cost: 100.34 ms | frame 82/393, time cost: 82.63 ms | frame 83/393, time cost: 85.52 ms | frame 84/393, time cost: 85.46 ms | frame 85/393, time cost: 83.84 ms | frame 86/393, time cost: 83.06 ms | frame 87/393, time cost: 83.63 ms | frame 88/393, time cost: 103.76 ms | frame 89/393, time cost: 86.33 ms | frame 90/393, time cost: 103.54 ms | frame 91/393, time cost: 82.82 ms | frame 92/393, time cost: 83.70 ms | frame 93/393, time cost: 85.72 ms | frame 94/393, time cost: 86.52 ms | frame 95/393, time cost: 84.62 ms | frame 96/393, time cost: 86.91 ms | frame 97/393, time cost: 87.66 ms | frame 98/393, time cost: 83.77 ms | frame 99/393, time cost: 87.08 ms | frame 100/393, time cost: 86.98 ms | frame 101/393, time cost: 82.87 ms | frame 102/393, time cost: 86.85 ms | frame 103/393, time cost: 106.53 ms | frame 104/393, time cost: 85.81 ms | frame 105/393, time cost: 85.46 ms | frame 106/393, time cost: 99.33 ms | frame 107/393, time 
cost: 86.35 ms | frame 108/393, time cost: 84.42 ms | frame 109/393, time cost: 82.21 ms | frame 110/393, time cost: 83.69 ms | frame 111/393, time cost: 82.95 ms | frame 112/393, time cost: 83.92 ms | frame 113/393, time cost: 100.03 ms | frame 114/393, time cost: 82.52 ms | frame 115/393, time cost: 98.21 ms | frame 116/393, time cost: 85.81 ms | frame 117/393, time cost: 85.15 ms | frame 118/393, time cost: 83.32 ms | frame 119/393, time cost: 85.59 ms | frame 120/393, time cost: 88.12 ms | frame 121/393, time cost: 84.48 ms | frame 122/393, time cost: 83.60 ms | frame 123/393, time cost: 86.05 ms | frame 124/393, time cost: 85.72 ms | frame 125/393, time cost: 100.99 ms | frame 126/393, time cost: 83.66 ms | frame 127/393, time cost: 86.57 ms | frame 128/393, time cost: 86.28 ms | frame 129/393, time cost: 104.56 ms | frame 130/393, time cost: 86.59 ms | frame 131/393, time cost: 85.48 ms | frame 132/393, time cost: 85.48 ms | frame 133/393, time cost: 97.89 ms | frame 134/393, time cost: 83.97 ms | frame 135/393, time cost: 99.07 ms | frame 136/393, time cost: 85.75 ms | frame 137/393, time cost: 99.29 ms | frame 138/393, time cost: 83.26 ms | frame 139/393, time cost: 86.21 ms | frame 140/393, time cost: 85.88 ms | frame 141/393, time cost: 84.96 ms | frame 142/393, time cost: 83.45 ms | frame 143/393, time cost: 104.84 ms | frame 144/393, time cost: 85.71 ms | frame 145/393, time cost: 84.93 ms | frame 146/393, time cost: 104.85 ms | frame 147/393, time cost: 83.14 ms | frame 148/393, time cost: 84.31 ms | frame 149/393, time cost: 85.54 ms | frame 150/393, time cost: 85.08 ms | frame 151/393, time cost: 82.45 ms | frame 152/393, time cost: 82.83 ms | frame 153/393, time cost: 83.29 ms | frame 154/393, time cost: 83.23 ms | frame 155/393, time cost: 83.58 ms | frame 156/393, time cost: 85.92 ms | frame 157/393, time cost: 83.26 ms | frame 158/393, time cost: 82.73 ms | frame 159/393, time cost: 83.74 ms | frame 160/393, time cost: 87.01 ms | frame 161/393, 
time cost: 87.00 ms | frame 162/393, time cost: 83.55 ms | frame 163/393, time cost: 84.13 ms | frame 164/393, time cost: 85.87 ms | frame 165/393, time cost: 85.47 ms | frame 166/393, time cost: 82.90 ms | frame 167/393, time cost: 83.35 ms | frame 168/393, time cost: 85.82 ms | frame 169/393, time cost: 101.64 ms | frame 170/393, time cost: 82.48 ms | frame 171/393, time cost: 82.88 ms | frame 172/393, time cost: 86.36 ms | frame 173/393, time cost: 86.43 ms | frame 174/393, time cost: 83.09 ms | frame 175/393, time cost: 84.04 ms | frame 176/393, time cost: 99.36 ms | frame 177/393, time cost: 85.49 ms | frame 178/393, time cost: 100.55 ms | frame 179/393, time cost: 86.06 ms | frame 180/393, time cost: 85.24 ms | frame 181/393, time cost: 86.48 ms | frame 182/393, time cost: 82.97 ms | frame 183/393, time cost: 83.82 ms | frame 184/393, time cost: 100.91 ms | frame 185/393, time cost: 87.10 ms | frame 186/393, time cost: 100.10 ms | frame 187/393, time cost: 83.31 ms | frame 188/393, time cost: 102.25 ms | frame 189/393, time cost: 86.36 ms | frame 190/393, time cost: 83.49 ms | frame 191/393, time cost: 86.37 ms | frame 192/393, time cost: 86.72 ms | frame 193/393, time cost: 85.07 ms | frame 194/393, time cost: 87.11 ms | frame 195/393, time cost: 91.19 ms | frame 196/393, time cost: 105.97 ms | frame 197/393, time cost: 86.60 ms | frame 198/393, time cost: 84.81 ms | frame 199/393, time cost: 82.24 ms | frame 200/393, time cost: 82.25 ms | frame 201/393, time cost: 87.25 ms | frame 202/393, time cost: 105.06 ms | frame 203/393, time cost: 83.19 ms | frame 204/393, time cost: 85.98 ms | frame 205/393, time cost: 105.63 ms | frame 206/393, time cost: 86.41 ms | frame 207/393, time cost: 99.98 ms | frame 208/393, time cost: 84.74 ms | frame 209/393, time cost: 106.34 ms | frame 210/393, time cost: 85.53 ms | frame 211/393, time cost: 85.37 ms | frame 212/393, time cost: 86.75 ms | frame 213/393, time cost: 83.06 ms | frame 214/393, time cost: 86.54 ms | frame 
215/393, time cost: 83.41 ms | frame 216/393, time cost: 106.22 ms | frame 217/393, time cost: 86.29 ms | frame 218/393, time cost: 106.71 ms | frame 219/393, time cost: 84.19 ms | frame 220/393, time cost: 86.34 ms | frame 221/393, time cost: 86.33 ms | frame 222/393, time cost: 83.44 ms | frame 223/393, time cost: 102.08 ms | frame 224/393, time cost: 82.52 ms | frame 225/393, time cost: 83.33 ms | frame 226/393, time cost: 83.19 ms | frame 227/393, time cost: 82.72 ms | frame 228/393, time cost: 82.87 ms | frame 229/393, time cost: 87.39 ms | frame 230/393, time cost: 83.72 ms | frame 231/393, time cost: 83.38 ms | frame 232/393, time cost: 86.01 ms | frame 233/393, time cost: 83.01 ms | frame 234/393, time cost: 86.57 ms | frame 235/393, time cost: 84.97 ms | frame 236/393, time cost: 86.38 ms | frame 237/393, time cost: 83.33 ms | frame 238/393, time cost: 85.74 ms | frame 239/393, time cost: 91.95 ms | frame 240/393, time cost: 85.95 ms | frame 241/393, time cost: 85.92 ms | frame 242/393, time cost: 103.53 ms | frame 243/393, time cost: 86.55 ms | frame 244/393, time cost: 86.05 ms | frame 245/393, time cost: 85.60 ms | frame 246/393, time cost: 86.71 ms | frame 247/393, time cost: 83.07 ms | frame 248/393, time cost: 106.63 ms | frame 249/393, time cost: 84.45 ms | frame 250/393, time cost: 84.60 ms | frame 251/393, time cost: 85.09 ms | frame 252/393, time cost: 86.18 ms | frame 253/393, time cost: 83.26 ms | frame 254/393, time cost: 82.78 ms | frame 255/393, time cost: 82.52 ms | frame 256/393, time cost: 84.67 ms | frame 257/393, time cost: 86.28 ms | frame 258/393, time cost: 84.33 ms | frame 259/393, time cost: 86.34 ms | frame 260/393, time cost: 87.51 ms | frame 261/393, time cost: 82.32 ms | frame 262/393, time cost: 86.92 ms | frame 263/393, time cost: 84.29 ms | frame 264/393, time cost: 85.27 ms | frame 265/393, time cost: 86.53 ms | frame 266/393, time cost: 82.66 ms | frame 267/393, time cost: 87.01 ms | frame 268/393, time cost: 105.62 ms | 
frame 269/393, time cost: 86.39 ms | frame 270/393, time cost: 103.34 ms | frame 271/393, time cost: 83.70 ms | frame 272/393, time cost: 83.90 ms | frame 273/393, time cost: 105.14 ms | frame 274/393, time cost: 85.88 ms | frame 275/393, time cost: 86.87 ms | frame 276/393, time cost: 85.35 ms | frame 277/393, time cost: 82.88 ms | frame 278/393, time cost: 87.15 ms | frame 279/393, time cost: 85.98 ms | frame 280/393, time cost: 82.58 ms | frame 281/393, time cost: 87.00 ms | frame 282/393, time cost: 82.81 ms | frame 283/393, time cost: 82.74 ms | frame 284/393, time cost: 86.37 ms | frame 285/393, time cost: 95.18 ms | frame 286/393, time cost: 83.97 ms | frame 287/393, time cost: 82.94 ms | frame 288/393, time cost: 82.88 ms | frame 289/393, time cost: 84.09 ms | frame 290/393, time cost: 84.02 ms | frame 291/393, time cost: 101.52 ms | frame 292/393, time cost: 103.50 ms | frame 293/393, time cost: 86.51 ms | frame 294/393, time cost: 86.19 ms | frame 295/393, time cost: 85.14 ms | frame 296/393, time cost: 82.62 ms | frame 297/393, time cost: 86.14 ms | frame 298/393, time cost: 85.77 ms | frame 299/393, time cost: 86.02 ms | frame 300/393, time cost: 82.63 ms | frame 301/393, time cost: 84.50 ms | frame 302/393, time cost: 102.41 ms | frame 303/393, time cost: 85.22 ms | frame 304/393, time cost: 85.69 ms | frame 305/393, time cost: 82.53 ms | frame 306/393, time cost: 87.06 ms | frame 307/393, time cost: 82.46 ms | frame 308/393, time cost: 83.03 ms | frame 309/393, time cost: 103.85 ms | frame 310/393, time cost: 86.99 ms | frame 311/393, time cost: 87.12 ms | frame 312/393, time cost: 87.09 ms | frame 313/393, time cost: 86.97 ms | frame 314/393, time cost: 104.30 ms | frame 315/393, time cost: 86.64 ms | frame 316/393, time cost: 101.64 ms | frame 317/393, time cost: 82.37 ms | frame 318/393, time cost: 82.62 ms | frame 319/393, time cost: 84.47 ms | frame 320/393, time cost: 86.54 ms | frame 321/393, time cost: 86.56 ms | frame 322/393, time cost: 
82.56 ms | frame 323/393, time cost: 84.10 ms | frame 324/393, time cost: 86.73 ms | frame 325/393, time cost: 84.09 ms | frame 326/393, time cost: 103.51 ms | frame 327/393, time cost: 82.59 ms | frame 328/393, time cost: 82.93 ms | frame 329/393, time cost: 86.92 ms | frame 330/393, time cost: 87.11 ms | frame 331/393, time cost: 83.07 ms | frame 332/393, time cost: 83.68 ms | frame 333/393, time cost: 86.67 ms | frame 334/393, time cost: 86.45 ms | frame 335/393, time cost: 86.55 ms | frame 336/393, time cost: 86.48 ms | frame 337/393, time cost: 86.51 ms | frame 338/393, time cost: 82.56 ms | frame 339/393, time cost: 87.60 ms | frame 340/393, time cost: 84.77 ms | frame 341/393, time cost: 84.02 ms | frame 342/393, time cost: 107.97 ms | frame 343/393, time cost: 82.82 ms | frame 344/393, time cost: 104.52 ms | frame 345/393, time cost: 85.87 ms | frame 346/393, time cost: 85.76 ms | frame 347/393, time cost: 85.78 ms | frame 348/393, time cost: 85.45 ms | frame 349/393, time cost: 82.70 ms | frame 350/393, time cost: 84.07 ms | frame 351/393, time cost: 85.82 ms | frame 352/393, time cost: 83.72 ms | frame 353/393, time cost: 102.32 ms | frame 354/393, time cost: 85.59 ms | frame 355/393, time cost: 102.47 ms | frame 356/393, time cost: 83.20 ms | frame 357/393, time cost: 83.91 ms | frame 358/393, time cost: 86.46 ms | frame 359/393, time cost: 82.64 ms | frame 360/393, time cost: 82.78 ms | frame 361/393, time cost: 86.05 ms | frame 362/393, time cost: 105.16 ms | frame 363/393, time cost: 86.42 ms | frame 364/393, time cost: 106.43 ms | frame 365/393, time cost: 85.26 ms | frame 366/393, time cost: 86.21 ms | frame 367/393, time cost: 84.15 ms | frame 368/393, time cost: 82.99 ms | frame 369/393, time cost: 86.29 ms | frame 370/393, time cost: 83.97 ms | frame 371/393, time cost: 90.60 ms | frame 372/393, time cost: 82.68 ms | frame 373/393, time cost: 84.05 ms | frame 374/393, time cost: 87.16 ms | frame 375/393, time cost: 87.22 ms | frame 376/393, time 
cost: 87.17 ms | frame 377/393, time cost: 83.50 ms | frame 378/393, time cost: 82.67 ms | frame 379/393, time cost: 82.69 ms | frame 380/393, time cost: 82.70 ms | frame 381/393, time cost: 82.77 ms | frame 382/393, time cost: 82.83 ms | frame 383/393, time cost: 86.76 ms | frame 384/393, time cost: 83.58 ms | frame 385/393, time cost: 82.54 ms | frame 386/393, time cost: 86.51 ms | frame 387/393, time cost: 82.47 ms | frame 388/393, time cost: 86.10 ms | frame 389/393, time cost: 86.26 ms | frame 390/393, time cost: 82.42 ms | frame 391/393, time cost: 87.11 ms | frame 392/393, time cost: 83.77 ms | frame 393/393, time cost: 83.29 ms

###########################################################################

| RTFNet: RTFNet_152 test results (with batch size 1) on 2019-07-29 using TITAN Xp:

| the tested dataset name: test
| the tested image count: 393
| the tested image size: 480*640
| recall per class: unlabeled: 0.991187, car: 0.929942, person: 0.792736, bike: 0.768230, curve: 0.607104, car_stop: 0.385262, guardrail: 0.000000, color_cone: 0.454696, bump: 0.746943
| iou per class: unlabeled: 0.979986, car: 0.874094, person: 0.702999, bike: 0.627392, curve: 0.453322, car_stop: 0.297951, guardrail: 0.000000, color_cone: 0.290689, bump: 0.557084

| average values (np.mean(x)): recall: 0.630678, iou: 0.531502
| average values (np.mean(np.nan_to_num(x))): recall: 0.630678, iou: 0.531502

| * the average time cost per frame (with batch size 1): 87.42 ms, namely, the inference speed is 11.44 fps
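As a quick sanity check on the summary lines, the averages and the frame rate can be recomputed from the per-class values printed in the log. This is only an arithmetic sketch in plain Python; the variable names are illustrative and are not taken from test.py.

```python
# Per-class values copied from the log, in the printed class order:
# unlabeled, car, person, bike, curve, car_stop, guardrail, color_cone, bump
recall = [0.991187, 0.929942, 0.792736, 0.768230, 0.607104,
          0.385262, 0.000000, 0.454696, 0.746943]
iou = [0.979986, 0.874094, 0.702999, 0.627392, 0.453322,
       0.297951, 0.000000, 0.290689, 0.557084]


def mean(values):
    """Plain arithmetic mean over all classes (unlabeled included)."""
    return sum(values) / len(values)


mean_recall = mean(recall)      # reported as "recall" (mAcc) in the log
mean_iou = mean(iou)            # reported as "iou" (mIoU) in the log

ms_per_frame = 87.42            # average time cost per frame from the log
fps = 1000.0 / ms_per_frame     # frames per second

print(f"mAcc {mean_recall:.6f}, mIoU {mean_iou:.6f}, {fps:.2f} fps")
```

Both averages in the log are identical because no class value is NaN here, so np.nan_to_num has nothing to replace. The guardrail class contributes 0 to both means.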

Although the GPU I use isn't a GTX 1080 Ti, a TITAN Xp should give a comparable result. As the log above shows, the speed is quite far from the result described in your paper.

yuxiangsun commented 5 years ago

The training code is heavily copied from MFNet, but I set drop_last in the train dataloader to False. The ResNet weights were not frozen. Dropout was not used. As the validation loss is very noisy, selecting a good final weight is partly a matter of luck. In addition, the random weight initialization influences the training.
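To make the drop_last setting concrete, here is a small plain-Python sketch of how a DataLoader-style batcher splits an epoch with and without drop_last. The helper name and the dataset and batch sizes are hypothetical and chosen only for illustration; this is not the actual training code.

```python
def batch_sizes(num_samples, batch_size, drop_last):
    """Sizes of the batches produced in one epoch.

    Mirrors the drop_last semantics of a typical data loader:
    with drop_last=True the trailing incomplete batch is discarded,
    with drop_last=False it is kept as a smaller final batch.
    """
    full, remainder = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)
    return sizes


# Hypothetical numbers: 10 training images, batch size 3.
dropped = batch_sizes(10, 3, drop_last=True)    # last image is skipped
kept = batch_sizes(10, 3, drop_last=False)      # last image forms its own batch

print(dropped, sum(dropped))
print(kept, sum(kept))
```

With drop_last set to False, every training sample is used in each epoch, at the cost of one smaller final batch whenever the dataset size is not a multiple of the batch size.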

yuxiangsun commented 5 years ago

As for the run time calculation method, I think it is fine as long as all the networks are evaluated in the same manner. The number I give is just a reference. You can speed it up using a number of tricks.
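For anyone comparing numbers, the synchronized measurement suggested earlier in this thread could be wrapped in a small helper like the sketch below. The function name and its parameters are assumptions made for illustration. The sync argument is a placeholder: on a GPU one would pass torch.cuda.synchronize, and fn would wrap the forward pass, e.g. lambda: model(images). Skipping warm-up calls is also a choice of this sketch, motivated by the first frame in the log above taking 220.16 ms while later frames take roughly 85 ms.

```python
import time


def time_per_call_ms(fn, sync=None, warmup=1, repeats=10):
    """Average wall-clock time of fn() in milliseconds.

    sync is called before starting and before stopping the clock so that
    any asynchronously queued work has finished. It defaults to a no-op,
    which is appropriate for purely CPU-bound functions.
    """
    if sync is None:
        sync = lambda: None

    # Warm-up calls are executed but not timed.
    for _ in range(warmup):
        fn()

    sync()                              # drain pending work before timing
    start = time.perf_counter()
    for _ in range(repeats):
        fn()
    sync()                              # wait for the timed work to finish
    elapsed = time.perf_counter() - start

    return elapsed * 1000.0 / repeats


# CPU-only usage example; no synchronization is needed here.
cpu_ms = time_per_call_ms(lambda: sum(range(100_000)), repeats=5)
print(f"{cpu_ms:.3f} ms per call")
```

Whichever method is used, applying the same one to every network keeps the comparison fair, which is the point made above; the synchronization only affects how close the absolute numbers are to the real GPU time.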

yuxiangsun commented 5 years ago

The weight file is correct. You forgot to switch the model to RTFNet-50 in RTFNet.py.

celynw commented 5 years ago

Thanks for responding.

Going back to the original reason I opened the issue: can we leave this open until there is a training script which allows people to reproduce the published results, or at least one that gets somewhere close? I understand there is an element of randomness and selection, but I only expect this to yield a small improvement.

lvcat commented 5 years ago

Recently I retrained this network using MFNet's training set and got the following results:

| class | acc | IoU |
| --- | --- | --- |
| class average | 50.3 | 43.5 |
| car | 91.1 | 84.3 |
| person | 76.4 | 66.1 |
| bike | 67.4 | 55.8 |
| curve | 60.1 | 42.4 |
| car stop | 38.7 | 30.1 |
| guardrail | 1.10 | 0.6 |
| color cone | 4.11 | 3.11 |
| bump | 14.5 | 11.4 |

I can't get the same results either. Maybe I missed something, but I think that, because detail on small objects is lost (32x downsampling), it is hard to reach the numbers in the paper. If I use a U-Net-like structure in RTFNet-50, I can get close results.

lvcat commented 5 years ago

I think I could get close results with RTFNet-152. I will retry RTFNet-50.

yuxiangsun commented 5 years ago

Thank you. You may want to set the batch size of RTFNet-50 to 3 so that it fits in your GPU memory.

lvcat commented 5 years ago

There is a weird thing: I get better results with the original ResNet (acc 62.2, IoU 53.8), but with RTFNet-152 I get only acc 53.5 and IoU 45.8. I don't know the reason. = =

yuxiangsun commented 5 years ago

What do you mean by the original ResNet?

lvcat commented 5 years ago

It's just the ResNet that PyTorch provides, with the fully connected layer removed; the decoder is then simply 3x3 convolutions plus upsampling. = =