LukeAI opened this issue 5 years ago
@LukeAI Hi,
No. Darknet has no VRAM optimizations for inference-only use.
So you can comment out all the lines containing `... update ...`
and `... delta ...`
here: https://github.com/AlexeyAB/darknet/blob/4c315ea26b56c2bf20ebc240d94386c6e3cc83db/src/convolutional_layer.c#L474-L540
and these lines: https://github.com/AlexeyAB/darknet/blob/4c315ea26b56c2bf20ebc240d94386c6e3cc83db/src/convolutional_layer.c#L529-L530
Commenting out lines 529, 530, 538 and 539 brought memory usage down from 2181MiB to 1853MiB (CUDNN_HALF) - great tip, thank you! I got mysterious runtime errors with the update and delta lines from 474 to 540 commented out, though.
Is there anything else I can do other than lines 529, 530, 538 and 539? If I can shave off another 22MiB per instance, I'll be able to run everything I want simultaneously.
You can try to find lines with `... update ...` and `... delta ...` in the other layers - make_maxpool_layer, make_shortcut_layer, ... - and comment them out as well.
Hey @AlexeyAB, I was wondering if you had any tips for decreasing the video-memory footprint of darknet without shrinking the network or changing the cfg? I have a 2080Ti and I'm using libdarknet.so wrapped in Python. I note that darknet uses 1867MB of video memory, or over 2000MB when compiled with CUDNN_HALF.
If I'm only planning to run inference one image at a time with yolov3-spp.cfg, I wonder if there are redundant bits of the source I can trim that are perhaps occupying a bit of VRAM? I realise that darknet is very tightly coded and that there probably aren't, but I thought I'd ask....