PetervanLunteren / EcoAssist

Simplify camera trap image analysis with ML species recognition models based around the MegaDetector model
MIT License

`CUDA out of memory` when checking for maximum batch size #27

Closed SimonKravis closed 1 year ago

SimonKravis commented 1 year ago

The message below appears when building a custom classifier on a GPU-equipped Windows machine, but the classifier appears to build anyway:

```
     Params      GFLOPs  GPU_mem (GB)  forward (ms)  backward (ms)               input  output
  140045044       835.1         2.233         125.5          52.49  (1, 3, 1280, 1280)    list
  140045044        1670         3.288         170.6          80.59  (2, 3, 1280, 1280)    list
  140045044        3340         5.677         355.8          130.8  (4, 3, 1280, 1280)    list
  140045044        6681         9.836          5323           2526  (8, 3, 1280, 1280)    list
CUDA out of memory. Tried to allocate 126.00 MiB (GPU 0; 8.00 GiB total capacity; 13.08 GiB already allocated; 0 bytes free; 13.51 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```

PetervanLunteren commented 1 year ago

This is standard practice when YOLOv5 checks the maximum batch size. It runs test passes with increasing batch sizes, and once it hits `CUDA out of memory`, it knows the largest batch size your hardware allows. The error is expected and can be ignored.
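
For anyone curious what that check is doing under the hood, here is a minimal, hypothetical sketch of the probing idea. This is not YOLOv5's actual code (its real logic lives in `utils/autobatch.py` and profiles memory at a few batch sizes to extrapolate a safe value rather than simply doubling until failure); the function name and the placeholder loss below are illustrative only:

```python
import torch

def find_max_batch_size(model, img_size=1280, device="cuda"):
    """Hypothetical sketch: double the batch size until CUDA runs out of memory."""
    model = model.to(device).train()
    batch, max_ok = 1, 1
    while True:
        try:
            imgs = torch.zeros(batch, 3, img_size, img_size, device=device)
            out = model(imgs)  # forward pass
            # Placeholder "loss" so the backward pass is exercised too,
            # mirroring the forward/backward columns in the log above.
            loss = sum(o.sum() for o in out) if isinstance(out, (list, tuple)) else out.sum()
            loss.backward()    # backward pass
            max_ok = batch
            batch *= 2         # 1, 2, 4, 8, ... as in the log
        except RuntimeError as e:
            if "out of memory" in str(e):
                torch.cuda.empty_cache()  # the OOM is expected: recover and stop
                break
            raise
        finally:
            model.zero_grad(set_to_none=True)
    return max_ok
```

In practice YOLOv5 then trains with a batch size somewhat below the failure point (targeting a fraction of total GPU memory), which is why the build continues normally after the OOM message.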