czg1225 / SlimSAM

SlimSAM: 0.1% Data Makes Segment Anything Slim
Apache License 2.0

About GPU usage during inference #21

Open yangzijia opened 1 week ago

yangzijia commented 1 week ago

Hi, thank you for sharing. I found that after pruning, the model size of slimsam-77 is only 38M, the same as edgesam's, but SlimSAM's GPU memory usage during inference is still very high (3071 MiB), while edgesam's is only 433 MiB. Why does this happen? I don't quite understand the technical principle.

czg1225 commented 1 week ago

Hi @yangzijia , SAM's memory footprint mainly comes from the global attention blocks of its image encoder. SlimSAM's channel pruning greatly reduces the number of parameters, but the number of image tokens remains unchanged, so the attention activations are just as large at inference time. If you want to reduce this, you can try further applying token merging or token pruning techniques.
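To see why token count dominates here, a rough back-of-the-envelope sketch helps: a global attention block materializes an N x N score matrix per head, so activation memory grows quadratically with the number of image tokens and is independent of channel width. The input size, patch size, head count, and fp16 storage below are illustrative assumptions, not measured SlimSAM numbers.

```python
# Illustrative estimate of the attention-score activation memory for one
# global attention block in a ViT-style encoder. All concrete numbers
# (1024x1024 input, 16x16 patches, 12 heads, fp16) are assumptions.

def attention_map_mib(num_tokens: int, num_heads: int, bytes_per_el: int = 2) -> float:
    """Memory (MiB) of the N x N attention score matrix across all heads."""
    return num_tokens ** 2 * num_heads * bytes_per_el / 2 ** 20

# 1024x1024 input with 16x16 patches -> 4096 image tokens.
# Channel pruning does not change this count.
tokens = (1024 // 16) ** 2

print(attention_map_mib(tokens, num_heads=12))       # full token count
print(attention_map_mib(tokens // 2, num_heads=12))  # after merging away half the tokens
```

Because the cost is quadratic in N, halving the token count (e.g. via token merging) cuts this term by 4x, which is why token reduction, not further channel pruning, is the lever for memory.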