Open HYH-M87 opened 2 months ago
Hello, the "out of memory" issue you encountered is primarily due to the upsampling function in Mask2Former. Initially, the function attempts to run the upsampling on the GPU, but if it encounters an out-of-memory (OOM) error, it automatically switches to CPU processing by calling retry_if_cuda_oom. For more specific details, please refer to the forward function in whole_image_segmentation/mask2former/maskformer_model.py.
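The GPU-then-CPU behaviour described above can be sketched in a few lines. This is only a dependency-free illustration of the control flow, not detectron2's actual retry_if_cuda_oom code: the helper name run_with_cpu_fallback and the two fake upsampling functions are hypothetical, and MemoryError stands in for the CUDA OOM exception that PyTorch would raise.

```python
from typing import Callable, Type, TypeVar

T = TypeVar("T")


def run_with_cpu_fallback(
    run_on_gpu: Callable[[], T],
    run_on_cpu: Callable[[], T],
    oom_error: Type[BaseException] = MemoryError,
) -> T:
    """Try the GPU path first; if it raises an OOM error, rerun on the CPU.

    Hypothetical helper for illustration. In a real PyTorch setup the
    exception would be the CUDA OOM error and the CPU path would first
    move the input tensors to host memory.
    """
    try:
        return run_on_gpu()
    except oom_error:
        # The fallback does not reduce the memory needed; it only moves
        # the demand from GPU memory to host RAM.
        return run_on_cpu()


def fake_gpu_upsample() -> str:
    # Simulate the GPU running out of memory during upsampling.
    raise MemoryError("simulated CUDA OOM")


def fake_cpu_upsample() -> str:
    return "upsampled on CPU"


result = run_with_cpu_fallback(fake_gpu_upsample, fake_cpu_upsample)
print(result)
```

Because the fallback only relocates the work, a machine with too little host RAM can still fail on the CPU path, which is consistent with the process being killed right after the "Attempting to copy inputs ... to CPU" message in the log.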
To fully address your question and clarify the source of this error, let's discuss how Mask2Former processes mask predictions. The model first generates low-resolution masks and then uses the F.interpolate() function to upsample these masks to match the size of the original image. Given that our model outputs several hundred masks per image and the SA1B images are of very high resolution, the F.interpolate() function requires significant memory capacity on either the GPU or CPU.
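A rough back-of-the-envelope estimate shows why the upsampled masks are so large. The numbers below are assumptions chosen for illustration, not values taken from the model's config: 300 masks, a 1500 x 2250 image, and 4 bytes per value (float32). The estimate covers only one dense output tensor; intermediate copies made during inference would add to it.

```python
def upsampled_mask_bytes(
    num_masks: int, height: int, width: int, bytes_per_value: int = 4
) -> int:
    """Size in bytes of one dense (num_masks, height, width) tensor."""
    return num_masks * height * width * bytes_per_value


def to_gib(num_bytes: int) -> float:
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30


# Assumed example values: 300 masks, 1500 x 2250 image, float32.
full = upsampled_mask_bytes(300, 1500, 2250)
# Halving each side of the image quarters the number of pixels.
half = upsampled_mask_bytes(300, 750, 1125)

print(f"full resolution: {to_gib(full):.2f} GiB")
print(f"half resolution: {to_gib(half):.2f} GiB")
```

Under these assumptions a single full-resolution mask tensor is already close to 4 GiB, and halving the image side lengths cuts that by a factor of four. The same formula shows that memory scales linearly with the number of masks kept per image.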
Here are a few strategies to fix this memory issue:
TEST.DETECTIONS_PER_IMAGE.
I'm here to help if you have more questions or need further clarification!
The prompt demo has a device-related bug that I haven't looked into. However, I could successfully run the whole-image demo by changing the batch size to 1 and using a 3090 (24 GB) with about 70 GB or more of CPU RAM.
Problems: When I followed the instructions in the README document under "Inference Demo for UnSAM with Pre-trained Models (whole image segmentation)" and tried to run demo_whole_image.py, the following error occurred:
[07/09 16:09:19 d2.utils.memory]: Attempting to copy inputs of <func to CPU due to CUDA OOM
Killed
I then monitored the GPU and memory usage, and the results are as follows: the GPU memory is not fully utilized, but the CPU usage and memory usage are very high. After this, the program crashed. I tried reducing the input image size by half, and the program ran normally, but the GPU memory was still fully utilized, and the CUDA OOM problem still occurred.
My questions: Is this normal? Does it need to use such a large amount of GPU memory?
The complete log output is as follows:
systeminfo