avanetten opened this issue 4 years ago
Hi!
I'm getting the same error: RuntimeError: CUDA out of memory. Tried to allocate 1.56 GiB (GPU 0; 14.73 GiB total capacity; 12.90 GiB already allocated; 947.88 MiB free; 12.92 GiB reserved in total by PyTorch)
I wonder whether inference and training work independently, because otherwise I won't bother starting training, since I'd hit the same error.
Could you share your yml file?
Summary of the bug
When running inference on images larger than ~1200x1200 pixels, CUDA often runs out of memory. This appears to happen because the tiler puts all subwindows of a large image into a single batch (https://github.com/CosmiQ/solaris/blob/master/solaris/nets/infer.py#L75), and that batch can be too large to fit into GPU memory.
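A minimal sketch of one way around this: split the stacked subwindow tensor into bounded mini-batches instead of doing a single forward pass. The names `infer_in_chunks`, `subwindows`, and `chunk_size` are illustrative, not solaris internals:

```python
import torch

def infer_in_chunks(model, subwindows, chunk_size=8):
    """Run the model over the tiler's subwindows in bounded mini-batches.

    A sketch only: `subwindows` is assumed to be the single (N, C, H, W)
    tensor that infer.py currently feeds to the model all at once, and
    `chunk_size` is a hypothetical knob that caps peak GPU memory.
    """
    outputs = []
    with torch.no_grad():
        for chunk in torch.split(subwindows, chunk_size, dim=0):
            # Move only one small chunk to the GPU at a time, then bring
            # the predictions back to CPU so GPU memory is freed promptly.
            outputs.append(model(chunk.cuda()).cpu())
    return torch.cat(outputs, dim=0)
```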
Steps to reproduce the bug
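Something like the standard solaris inference workflow from the tutorials should trigger it, provided the source imagery is larger than ~1200x1200 (the config filename below is a placeholder for any model config pointing at such imagery):

```python
import solaris as sol

# Parse a standard solaris config; 'xdxd_spacenet4.yml' is a placeholder.
config = sol.utils.config.parse('xdxd_spacenet4.yml')

# Build the inference DataFrame and run inference. With a large source
# image, the tiler's subwindows are collected into one oversized batch
# and the forward pass raises a CUDA out-of-memory RuntimeError.
inferer = sol.nets.infer.Inferer(config)
inf_df = sol.nets.infer.get_infer_df(config)
inferer(inf_df)
```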
Buggy behavior and/or error message
Inference fails with a RuntimeError: CUDA out of memory during the forward pass (the same traceback as the one quoted in the comment above).
Expected behavior
Inference should run smoothly on large images.
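Until the batching is changed, one possible workaround is to pre-tile large images to a chip size the GPU can handle and point inference at the resulting tiles. A sketch assuming solaris's documented RasterTiler API, with placeholder paths and sizes:

```python
import solaris as sol

# Pre-tile the large source image into e.g. 512x512 chips so inference
# never sees more subwindows at once than the GPU can hold.
tiler = sol.tile.raster_tile.RasterTiler(dest_dir='inference_tiles/',
                                         src_tile_size=(512, 512))
tiler.tile(src='large_image.tif')
# Then point the inference config's data paths at 'inference_tiles/'.
```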