PRBonn / segcontrast

MIT License

About inference #2

songw-zju closed 2 years ago

songw-zju commented 2 years ago

Thanks for your great work; it has been very helpful to me. I ran into a problem (CUDA out of memory) when running inference_vis.py for inference on a single RTX 2080 Ti (11 GB). I then tried to avoid keeping gradient information during the inference phase in the following two ways:

    with torch.no_grad():
        h = model['model'](x)
        z = model['classifier'](h)

and

    h = model['model'](x)
    z = model['classifier'](h)

    z_data = z.detach().clone()
    del z, h
    torch.cuda.empty_cache()

However, the resulting accuracy was very poor, as the log below shows. Is there something wrong with my approach?
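
For reference, here is a minimal sketch of gradient-free inference in PyTorch. It uses two small `nn.Linear`-based modules as hypothetical stand-ins for `model['model']` and `model['classifier']`; they are not the repository's networks, and the tensor sizes are arbitrary. It illustrates that `torch.no_grad()` only stops autograd from recording a graph (which saves memory) and does not change the forward outputs, so on its own it would not be expected to lower accuracy.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the backbone and the head (assumption:
# the real networks are different; only the dict layout mirrors the issue).
model = {
    "model": nn.Sequential(nn.Linear(4, 8), nn.ReLU()),
    "classifier": nn.Linear(8, 20),
}
for net in model.values():
    net.eval()  # switch layers such as dropout / batch norm to inference mode

x = torch.randn(16, 4)  # 16 dummy "points" with 4 features each

# Forward pass with autograd enabled: a graph is recorded and kept in memory.
z_grad = model["classifier"](model["model"](x))

# Forward pass without autograd: no graph is recorded, values are the same.
with torch.no_grad():
    h = model["model"](x)
    z_nograd = model["classifier"](h)

pred = z_nograd.argmax(dim=1)  # per-point predicted class index
same_outputs = torch.allclose(z_grad.detach(), z_nograd)
```

In this sketch `same_outputs` compares the two forward passes; the second variant from the question (detaching after the forward pass) still builds the graph during the forward pass, so it is less likely to help with peak memory.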

epoch119_model_segment_contrast_0p01.pt epoch119_model_head_segment_contrast_0p01.pt
Loading model: segment_contrast_0p01, from epoch: 119
The size of validation data is 4071
[IOU EVAL] IGNORE:  tensor(0)
[IOU EVAL] INCLUDE:  tensor([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
        19])
100%|███████████████████████████████████████| 4071/4071 [47:08<00:00,  1.44it/s]

Model Acc.: 0.6048522326095858  Model mIoU: 0.12205555582195342

- Per Class mIoU:
    unlabeled: 0.0
    car: 0.03326307758955674
    bicycle: 0.0
    motorcycle: 0.0
    truck: 0.005434706041024883
    other-vehicle: 0.0009393006041743345
    person: 2.105848362071144e-05
    bicyclist: 0.0
    motorcyclist: 0.0
    road: 0.5329202453443465
    parking: 0.00024430327277334115
    sidewalk: 0.14130937737870972
    other-ground: 0.005318916716387502
    building: 0.42063443358322344
    fence: 0.04141291136875783
    vegetation: 0.663303909076575
    trunk: 0.0005299962863970672
    terrain: 0.3674830987154452
    pole: 0.10601565141190306
    traffic-sign: 0.00022457474421981168
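
The log reports a fairly high accuracy next to a very low mIoU. A small worked example can show how that combination arises when a few frequent classes dominate. This is a sketch under assumptions: the confusion matrix is invented toy data, the helper names `accuracy` and `per_class_iou` are hypothetical, and this is not the repository's evaluation code.

```python
def accuracy(conf):
    """Fraction of points whose predicted class equals the true class."""
    total = sum(sum(row) for row in conf)
    correct = sum(conf[c][c] for c in range(len(conf)))
    return correct / total


def per_class_iou(conf):
    """IoU per class from conf[t][p] = points of true class t predicted as p."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fp = sum(conf[t][c] for t in range(n) if t != c)
        fn = sum(conf[c][p] for p in range(n) if p != c)
        denom = tp + fp + fn
        ious.append(tp / denom if denom > 0 else 0.0)
    return ious


# Toy data: class 0 is frequent and well predicted, classes 1 and 2 are rare
# and mostly predicted as class 0.
conf = [
    [90, 0, 0],
    [8, 2, 0],
    [10, 0, 0],
]

acc = accuracy(conf)
ious = per_class_iou(conf)
miou = sum(ious) / len(ious)
print(f"acc={acc:.3f} mIoU={miou:.3f} per-class={ious}")
```

Because mIoU averages over classes rather than over points, the two rare classes pull the mean down even though most points are labelled correctly.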

nuneslu commented 2 years ago

Are you using the bash script in tools/eval_train.sh? With this script the CUDA error should not happen, since I used a GPU with only 8 GB. The performance should also be fixed by using the bash script, since it defines the correct parameters (for example the resolution, which checkpoint to use, and so on) as used in the paper.

You can run bash tools/eval_train.sh and you should be able to reproduce the results.

nuneslu commented 2 years ago

You should probably pull the repo, since I have updated the documentation and the bash scripts since the last issue. :smile:

songw-zju commented 2 years ago

Thank you for your suggestion! I will retest with the updated code.

nuneslu commented 2 years ago

Ok! Let me know if the problem still occurs.

songw-zju commented 2 years ago

Hi, @nuneslu, the problem disappeared with the updated code. Thanks for your help and great work!