opendr-eu / opendr

A modular, open and non-proprietary toolkit for core robotic functionalities by harnessing deep learning
Apache License 2.0

Efficient LPS: CUDA out-of-memory issues vs. low PQ/SQ/RQ from evaluation with modified config #452

Closed CamRW closed 1 year ago

CamRW commented 1 year ago

Hello, thanks again for all your hard work on building this toolkit.

I apologize if I'm missing something obvious as I'm still pretty new to this :)

I'm trying to get the Efficient LPS example_usage.py to run on a relatively low-memory GPU (4 GB), purely for evaluation/inference with pretrained weights. Is this possible? I know that if I change the img_scale parameter in singlegpu_semantickitti.py, I can get it to fit on the GPU. For example:

test_pipeline = [
    dict(type='LoadLidarFromFile', project=True, H=64, W=2048, fov_up=3.0, fov_down=-25.0, gt=True, max_points=150000,
         sensor_img_means=[12.12, 10.88, 0.23, -1.04, 0.21], sensor_img_stds=[12.32, 11.47, 6.91, 0.86, 0.16]),
    dict(type='Resize', img_scale=(1024, 64), multiscale_mode='value', keep_ratio=False),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img']),
]
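For intuition on why the smaller scale fits in 4 GB, here is my own back-of-envelope arithmetic (assuming, as a simplification, that activation memory scales roughly linearly with the number of projected range-image pixels; the real footprint also depends on the network architecture):

```python
def pixels(img_scale):
    """Number of pixels in a (width, height) range-image scale."""
    w, h = img_scale
    return w * h

original = pixels((4096, 256))  # default img_scale in singlegpu_semantickitti.py
reduced = pixels((1024, 64))    # the scale that fits on a 4 GB GPU

# The reduced scale processes 16x fewer pixels per scan.
print(original // reduced)  # → 16
```

Under that rough assumption, the reduced scale cuts activation memory by about 16x, which is consistent with the model fitting on a 4 GB card only at the smaller resolution.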

img_scale is normally (4096, 256), and when I run example_usage.py with the original img_scale, I get a CUDA out-of-memory error:

"RuntimeError: CUDA out of memory. Tried to allocate xxx MiB (GPU X; Y MiB total capacity; Z MiB already allocated; A MiB free; B MiB cached)"

I assume this parameter affects the model's accuracy in some way, as I get the following evaluation statistics when running the SemanticKITTI evaluation with img_scale=(1024, 64):

Category      |    PQ    SQ    RQ   IoU
-----------------------------------------
all           |   4.6  24.7   7.3   0.1
things        |   0.4  22.3   0.7   0.0
stuff         |   7.6  26.4  12.1   0.1
-----------------------------------------
unlabeled     |   0.0   0.0   0.0   0.0
car           |   2.8  58.6   4.7   0.1
bicycle       |   0.0   0.0   0.0   0.0
motorcycle    |   0.0   0.0   0.0   0.0
truck         |   0.0   0.0   0.0   0.0
other-vehicle |   0.0  55.2   0.1   0.0
person        |   0.5  64.8   0.7   0.0
bicyclist     |   0.0   0.0   0.0   0.0
motorcyclist  |   0.0   0.0   0.0   0.0
road          |  40.3  62.7  64.4   0.5
parking       |   0.0   0.0   0.0   0.0
sidewalk      |   0.0  50.5   0.0   0.0
other-ground  |   0.0   0.0   0.0   0.0
building      |   0.0   0.0   0.0   0.0
fence         |   0.1  51.8   0.2   0.0
vegetation    |  31.5  63.2  49.9   0.5
trunk         |   0.0   0.0   0.0   0.0
terrain       |  11.3  61.6  18.4   0.3
pole          |   0.0   0.0   0.0   0.0
traffic-sign  |   0.0   0.0   0.0   0.0
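For readers unfamiliar with the columns above: panoptic quality decomposes as PQ = SQ × RQ. A minimal sketch of the standard definition (Kirillov et al.'s panoptic segmentation metric, not OpenDR's actual evaluation code):

```python
def panoptic_quality(matched_ious, fp, fn):
    """Compute (PQ, SQ, RQ) for one class.

    matched_ious: IoU of each matched prediction/ground-truth segment
                  pair (matches require IoU > 0.5, so each is a TP).
    fp: unmatched predicted segments, fn: unmatched ground-truth segments.
    """
    tp = len(matched_ious)
    if tp == 0:
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / tp                 # mean IoU of matches
    rq = tp / (tp + 0.5 * fp + 0.5 * fn)        # F1-style recognition term
    return sq * rq, sq, rq

# Hypothetical class with two matched segments and one miss each way:
pq, sq, rq = panoptic_quality([0.9, 0.8], fp=1, fn=1)
# sq = 0.85, rq = 2/3, pq ≈ 0.567
```

The table reports these values as percentages (0 to 100), so e.g. the "road" row corresponds to SQ ≈ 0.627 and RQ ≈ 0.644. The near-zero PQ for most classes suggests the pretrained weights, trained at the full (4096, 256) resolution, do not transfer to the 16x-downscaled input.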

Is it possible to run this model on a relatively low-memory GPU? Or is it possible to run it on the CPU? I'd like to see if I can replicate the statistics from the original paper. Any help is greatly appreciated!

Sincerely, Cameron Weigel

vniclas commented 1 year ago

Hi @CamRW, I'm very sorry for the delayed response. Could you let me know whether your question is still relevant?

Generally speaking, the model is quite memory-intensive, i.e., I doubt that you'll be able to fit it on a 4 GB GPU. Running it on the CPU is unfortunately not possible due to the underlying base code.

CamRW commented 1 year ago

Hi @vniclas,

No worries! I figured as much; I wanted to ask just in case there was a solution I wasn't aware of. I'd say this question is no longer applicable. I'll test it on more appropriate hardware and see if I can reproduce the published results. Thanks for getting back to me! Feel free to close/delete/etc. this issue.