google-research / semivl

[ECCV'24] Official Implementation of SemiVL: Semi-Supervised Semantic Segmentation with Vision-Language Guidance
Apache License 2.0
111 stars 9 forks

Pre-calculate Pseudo-labels for the CLIP Guidance Loss #20

Closed by wangme88 2 months ago

wangme88 commented 2 months ago

Hi, I’m encountering a CUDA out-of-memory (OOM) error similar to the one described in #10 while running the Pascal VOC experiment with 92 labels on TITAN Xp GPUs. It occurs during the ASPP stage. To troubleshoot, I reduced the batch size to 1 and scaled down the decoder channels. Specifically, I made the following adjustments to the decoder head:

Due to my limited VRAM, I would like to try pre-calculating the pseudo-labels for the CLIP guidance loss, as suggested in the other thread. How should I go about this? I’ve identified the forward_maskclip function in model/vlm.py as a potential candidate, but it appears to process weakly augmented images, which vary per iteration. I’m unsure how best to handle this variability when pre-calculating the labels.

Thanks for your help!
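
One possible way to handle the per-iteration variability, sketched here under stated assumptions rather than taken from the SemiVL code: compute the label map once on the un-augmented image, cache it, and at training time apply the same sampled geometry (resize, crop, flip) to both the image and the cached label map using nearest-neighbour lookup. Everything named below (`precompute_labels`, `sample_weak_aug`, `apply_weak_aug`, `predict_fn`, `toy_predict`, the ignore value 255, the scale range) is hypothetical; `predict_fn` only stands in for whatever produces the CLIP guidance prediction. Note that this is an approximation, since a prediction made on the full image and then cropped is generally not identical to a prediction made on the augmented crop.

```python
import numpy as np

IGNORE_INDEX = 255  # assumed ignore label for padded pixels


def precompute_labels(images, predict_fn):
    """Run the frozen predictor once per un-augmented image and cache the label maps."""
    return {name: predict_fn(img).astype(np.uint8) for name, img in images.items()}


def sample_weak_aug(rng, height, width, crop_size, scale_range=(0.5, 2.0)):
    """Draw the parameters of one weak augmentation (resize, crop, horizontal flip)."""
    scale = rng.uniform(*scale_range)
    new_h = max(1, int(round(height * scale)))
    new_w = max(1, int(round(width * scale)))
    top = int(rng.integers(0, max(new_h - crop_size, 0) + 1))
    left = int(rng.integers(0, max(new_w - crop_size, 0) + 1))
    flip = bool(rng.random() < 0.5)
    return {"new_h": new_h, "new_w": new_w, "top": top,
            "left": left, "flip": flip, "crop": crop_size}


def apply_weak_aug(array, params, fill):
    """Apply the sampled geometry to an (H, W) or (H, W, C) array.

    Nearest-neighbour lookup is used so that class indices are never blended.
    Pixels that fall outside the resized image are set to `fill`.
    """
    h, w = array.shape[:2]
    # Coordinates of the crop window, expressed in the resized image.
    ys = params["top"] + np.arange(params["crop"])
    xs = params["left"] + np.arange(params["crop"])
    if params["flip"]:
        xs = xs[::-1]
    inside = (ys < params["new_h"])[:, None] & (xs < params["new_w"])[None, :]
    # Map resized coordinates back to source pixels.
    src_y = np.minimum(ys * h // params["new_h"], h - 1)
    src_x = np.minimum(xs * w // params["new_w"], w - 1)
    out = array[src_y[:, None], src_x[None, :]]  # fancy indexing returns a copy
    out[~inside] = fill
    return out


def toy_predict(image):
    """Placeholder 'model': class 1 where the pixel is bright, else class 0."""
    return (image.mean(axis=-1) > 0.5).astype(np.uint8)


rng = np.random.default_rng(0)
images = {"img_0": rng.random((60, 80, 3))}
cache = precompute_labels(images, toy_predict)  # done once, before training

# Inside the training loop: sample once, apply to image and cached labels alike.
params = sample_weak_aug(rng, 60, 80, crop_size=64)
image_aug = apply_weak_aug(images["img_0"], params, fill=0.0)
label_aug = apply_weak_aug(cache["img_0"], params, fill=IGNORE_INDEX)
```

Because both arrays go through the same index mapping, the cached labels stay pixel-aligned with the augmented image. Photometric augmentations are left out of the sketch since they do not move pixels, and in practice the cache could be written to disk (for example with `np.save`) instead of being held in a dict.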

wangme88 commented 2 months ago

I was able to run the experiment with half the channel size on two A6000 GPUs. Closing this thread for now.