Picsart-AI-Research / SeMask-Segmentation

[NIVT Workshop @ ICCV 2023] SeMask: Semantically Masked Transformers for Semantic Segmentation
https://arxiv.org/abs/2112.12782

Question: out of memory and mask prediction #13

Closed · an99990 closed this 2 years ago

an99990 commented 2 years ago

Hi, I've been trying a lot of the variants here and I have some questions. I am able to run Mask2Former with a Swin-L backbone on 8 GB of GPU memory, but I am not able to run SeMask-FPN with the tiny Swin backbone. The number of parameters is really different; is there an explanation?

Also, I was trying to visualize the mask output in the prediction, but I only get a black screen. The prediction output is a dict `sem_seg[X, img.width, img.height]`; does this mean there are X masks, and the visualized output is a sum of all those masks?

praeclarumjj3 commented 2 years ago

Are you sure you can run a Swin-L Mask2Former but not a SeMask-T FPN? The number of parameters will obviously be different since the decoders are different.

X denotes the number of classes -> the number of binary masks. We take an argmax along the channel dimension of the output to get the final one-channel prediction. If this is unclear, I suggest you read about how inference is run for semantic segmentation.
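
A minimal sketch of that step, assuming a detectron2-style `sem_seg` tensor of shape `[X, H, W]` (the toy shapes and random tensor below are illustrative, not the repo's actual output):

```python
import torch

# Toy stand-in for predictions["sem_seg"]: X class-logit maps over an H x W image.
X, H, W = 150, 4, 4                  # e.g. 150 ADE20K classes; tiny H, W for the demo
sem_seg = torch.randn(X, H, W)       # one logit map per class (the X "masks" above)

# Argmax over the class/channel dimension collapses the X maps into a
# single-channel prediction: each pixel stores the index of its best class.
label_map = sem_seg.argmax(dim=0)    # shape [H, W], integer class ids
print(label_map.shape, label_map.dtype)  # torch.Size([4, 4]) torch.int64
```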

For which setting do you get the black screen? It would be helpful if you explained the complete model+dataset setting.

an99990 commented 2 years ago

So these configs work using demo.py:

```python
"/SeMask_Segmentation/SeMask_Mask2Former/configs/ade20k/semantic-segmentation/semask_swin/maskformer2_semask_swin_large_IN21k_384_bs16_160k_res640.yaml",
"MODEL.WEIGHTS",
f"{root_dir}/models_weights/semask_large_mask2former_ade20k.pth"]
```

but this one, using image_demo.py, doesn't:

```python
"SeMask_Segmentation/SeMask_FPN/configs/semask_swin/coco_stuff10k/semfpn_semask_swin_tiny_patch4_window7_512x512_80k_coco10k.py",
f"{img_dir}/images/person_bike.jpg",
f"{model_dir}/models_weights/semask_tiny_fpn_coco10k.pth"]
```

praeclarumjj3 commented 2 years ago

What's the error? About the black prediction, did you update the --palette argument for demo.py?

an99990 commented 2 years ago

Hi, no actual errors; I just tried to display `predictions["sem_seg"][0]` to see what it looks like.

praeclarumjj3 commented 2 years ago

What did you use to display it? Anyway, you need to specify `--palette` so the correct dataset palette is used.

https://github.com/Picsart-AI-Research/SeMask-Segmentation/blob/19903ffb09052000e884749bc2c6dee382c63a20/SeMask-FPN/demo/demo.py#L16

https://github.com/Picsart-AI-Research/SeMask-Segmentation/blob/19903ffb09052000e884749bc2c6dee382c63a20/SeMask-FPN/demo/demo.py#L31

The above line already opens a window to display the results. If you want to save the results instead, replace the `imshow` with `savefig` at #L117: https://github.com/Picsart-AI-Research/SeMask-Segmentation/blob/19903ffb09052000e884749bc2c6dee382c63a20/SeMask-FPN/mmseg/apis/inference.py#L115-L117.
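
For reference, a hedged sketch of the save-to-disk route using the mmseg-style helpers the demo already imports; the config/checkpoint paths are placeholders for your setup, and the `'coco'` palette name assumes the fork's `get_palette` accepts it:

```python
from mmseg.apis import inference_segmentor, init_segmentor
from mmseg.core.evaluation import get_palette

# Placeholder paths: point these at your own config and checkpoint.
config_file = 'configs/semask_swin/coco_stuff10k/semfpn_semask_swin_tiny_patch4_window7_512x512_80k_coco10k.py'
checkpoint_file = 'semask_tiny_fpn_coco10k.pth'

model = init_segmentor(config_file, checkpoint_file, device='cuda:0')
result = inference_segmentor(model, 'person_bike.jpg')

# show_result blends the palette colors over the input image; passing
# out_file writes the visualization to disk instead of opening a window.
model.show_result('person_bike.jpg', result,
                  palette=get_palette('coco'), out_file='person_bike_pred.png')
```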

an99990 commented 2 years ago

Yes, thank you. Do you have an idea why I can run `maskformer2_semask_swin_large_IN21k_384_bs16_160k_res640.yaml` but not `semfpn_semask_swin_tiny_patch4_window7_512x512_80k_coco10k`?

praeclarumjj3 commented 2 years ago

I am not sure. Ideally, it should work during inference for you. What's the exact log output?

an99990 commented 2 years ago

With `/workspaces/halodi-segmentation/halodi_segmentation/models_weights/semask_tiny_fpn_coco10k.pth`:

```
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/SeMask_Segmentation/SeMask_FPN/demo/image_demo.py", line 48, in <module>
    main()
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/SeMask_Segmentation/SeMask_FPN/demo/image_demo.py", line 38, in main
    result = inference_segmentor(model, args[1])
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/SeMask_Segmentation/SeMask_FPN/mmseg/apis/inference.py", line 97, in inference_segmentor
    result = model(return_loss=False, rescale=True, **data)
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/mmcv/mmcv/runner/fp16_utils.py", line 109, in new_func
    return old_func(*args, **kwargs)
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/SeMask_Segmentation/SeMask_FPN/mmseg/models/segmentors/base.py", line 145, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/SeMask_Segmentation/SeMask_FPN/mmseg/models/segmentors/base.py", line 127, in forward_test
    return self.simple_test(imgs[0], img_metas[0], **kwargs)
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/SeMask_Segmentation/SeMask_FPN/mmseg/models/segmentors/encoder_decoder.py", line 272, in simple_test
    seg_logit = self.inference(img, img_meta, rescale)
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/SeMask_Segmentation/SeMask_FPN/mmseg/models/segmentors/encoder_decoder.py", line 257, in inference
    seg_logit = self.whole_inference(img, img_meta, rescale)
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/SeMask_Segmentation/SeMask_FPN/mmseg/models/segmentors/encoder_decoder.py", line 226, in whole_inference
    seg_logit = resize(
  File "/workspaces/halodi-segmentation/halodi_segmentation/models/SeMask_Segmentation/SeMask_FPN/mmseg/ops/wrappers.py", line 29, in resize
    return F.interpolate(input, size, scale_factor, mode, align_corners)
CUDA out of memory. Tried to allocate 7.71 GiB (GPU 0; 8.00 GiB total capacity; 443.31 MiB already allocated; 5.49 GiB free; 478.00 MiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```
praeclarumjj3 commented 2 years ago

Well, it must have something to do with the ops inside mmsegmentation and mmcv. Since Mask2Former is built on detectron2, it is probably better optimized for memory. You could try another configuration (ADE20K or Cityscapes) to check whether it works, but this comes down to the complexity of the underlying library operations.
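
Two hedged things worth checking on an 8 GiB card. First, a back-of-envelope number: the failing call resizes the class-logit tensor to the full input resolution, so with COCO-Stuff10k's 171 classes in float32, a 7.71 GiB request corresponds to roughly a 12-megapixel image. Second, mmseg-style configs can switch test-time inference from whole-image to sliding-window, which runs on crops and caps peak memory; the crop/stride values below are illustrative, not tuned for SeMask-FPN:

```python
import mmcv

# Rough memory estimate (assumptions: 171 COCO-Stuff classes, float32 logits
# resized to the full input resolution inside whole_inference).
num_classes, bytes_per_float = 171, 4
requested_bytes = 7.71 * 2**30                    # the 7.71 GiB from the error
pixels = requested_bytes / (num_classes * bytes_per_float)
print(f"~{pixels / 1e6:.1f} MP")                  # ~12.1 MP, i.e. a large photo

# Illustrative workaround: sliding-window inference instead of whole-image.
cfg = mmcv.Config.fromfile(
    'configs/semask_swin/coco_stuff10k/'
    'semfpn_semask_swin_tiny_patch4_window7_512x512_80k_coco10k.py')
# Depending on the mmseg version the fork tracks, test_cfg lives either at
# the config root or under model; older forks use the root-level form:
cfg.test_cfg = dict(mode='slide', crop_size=(512, 512), stride=(341, 341))
```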

How much space on GPU does the SeMask-L Mask2Former use?

an99990 commented 2 years ago

Around 5 GB.

praeclarumjj3 commented 2 years ago

Yeah, I suppose it's down to the complexity of the underlying operations. Detectron2 is more memory-efficient, going by your numbers.
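
As an aside, the error message itself points at allocator fragmentation; a minimal sketch of that knob (the 128 MiB split size is an arbitrary starting point, not a tested recommendation):

```python
import os

# Must be set before the process makes its first CUDA allocation,
# e.g. at the very top of image_demo.py, before any model code runs.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"
```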

an99990 commented 2 years ago

Alright, thank you :)