Qsingle / LearnablePromptSAM

Using the SAM-ViT as the backbone to create a learnable prompt for semantic segmentation
Apache License 2.0

GPU memory requirement #13

Open Raspberry-beans opened 7 months ago

Raspberry-beans commented 7 months ago

Thanks for the great work!

I will be replicating your approach to fine-tune SAM on custom few-shot medical images. The problem is that I only have 8 GB of GPU memory available from my university.

Would it be possible to replicate your approach with only 8 GB of memory? If not, is there any way I can do the fine-tuning in less than 8 GB, e.g. using a batch_size of 1, reducing the size of the prompt layer, or using the SAM ViT-B image encoder?

Your suggestions will be appreciated.

Regards, Muhammad

Qsingle commented 7 months ago

Thank you. You can try ViT-B as the backbone and set the batch size to 1. If the memory is still not enough, you can also try to downsample the input and resize the position embedding; you can follow our other repository for that.
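
For anyone landing here later, a minimal sketch of the position-embedding resize, assuming the standard `segment_anything` package (not the authors' exact code, and the checkpoint path is whatever you downloaded): SAM ViT-B stores `pos_embed` as a `(1, 64, 64, C)` grid for 1024x1024 inputs with patch size 16, so a 512x512 input needs it interpolated down to 32x32. The relative position embeddings inside the attention blocks are interpolated on the fly by `segment_anything`, so only `pos_embed` needs manual handling.

```python
import torch
import torch.nn.functional as F
from segment_anything import sam_model_registry

# Load the lighter ViT-B backbone (official SAM ViT-B checkpoint).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")

new_img_size = 512                    # downsampled input resolution
patch_size = 16                       # ViT-B patch size
new_grid = new_img_size // patch_size # 32x32 token grid instead of 64x64

# pos_embed has shape (1, 64, 64, C); interpolate the two spatial dims.
pe = sam.image_encoder.pos_embed.data.permute(0, 3, 1, 2)  # (1, C, 64, 64)
pe = F.interpolate(pe, size=(new_grid, new_grid),
                   mode="bicubic", align_corners=False)
sam.image_encoder.pos_embed = torch.nn.Parameter(pe.permute(0, 2, 3, 1))

# Tell the model to pad/expect the smaller input size during preprocessing.
sam.image_encoder.img_size = new_img_size
```

Halving the input side this way roughly quarters the number of tokens the encoder attends over, which is where most of the activation memory goes.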

Raspberry-beans commented 7 months ago

Thanks a lot for your response. I am also thinking of first fine-tuning the mask decoder only (keeping the image and prompt encoders frozen) on a few custom images. Do you think this would be feasible in my case, since the SAM authors mention that their mask decoder is lightweight?

Have a nice day!

Qsingle commented 7 months ago

Yeah, the decoder is lightweight.
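
A minimal sketch of that decoder-only setup, again assuming the standard `segment_anything` package rather than this repo's training script: freeze the heavy image encoder and the prompt encoder, and pass only the mask decoder's parameters to the optimizer.

```python
import torch
from segment_anything import sam_model_registry

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")

# Freeze the encoders; only the lightweight mask decoder will be updated.
for p in sam.image_encoder.parameters():
    p.requires_grad = False
for p in sam.prompt_encoder.parameters():
    p.requires_grad = False

optimizer = torch.optim.AdamW(sam.mask_decoder.parameters(), lr=1e-4)
```

Since the image encoder is frozen, you can also precompute its embeddings once under `torch.no_grad()` and reuse them every epoch, which avoids storing encoder activations entirely and makes an 8 GB budget much more comfortable.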