med-air / 3DSAM-adapter

Holistic Adaptation of SAM from 2D to 3D for Promptable Medical Image Segmentation

Out of Memory #2

Open zutsusemi opened 1 year ago

zutsusemi commented 1 year ago

I ran the sample on KiTS, but it requires a lot of memory, far more than the 24 GB on a 3090. How much GPU memory is needed to run this code?

peterant330 commented 1 year ago

Hi, the code requires about 35 GB of GPU memory. We tested with two 3090s or one A40.

zutsusemi commented 1 year ago

So how do you "connect" the two 3090s? Or did you somehow "split" the model into two parts?

peterant330 commented 1 year ago

I split the model into two parts and assigned each part to a different GPU.

Asagami-Fujino commented 11 months ago

I tried to run only the encoder on a 3090. The input size is 160x160x160. I didn't interpolate the input to 512x512x512; instead I changed patch_size to 5x5x5 (so the number of patches stays the same). It still ran out of memory. Do you split the model this way, i.e. the encoder on one GPU and everything else on the other? Or is there some other way to split it?
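The "same number of patches" claim above can be checked with a little arithmetic. The sketch below assumes the default patch size is 16 (SAM's ViT default; the thread does not state it), and `patches_per_volume` is a hypothetical helper, not a function from the repository.

```python
def patches_per_volume(size: int, patch_size: int) -> int:
    """Count non-overlapping cubic patches in a size x size x size volume."""
    if size % patch_size != 0:
        raise ValueError("size must be divisible by patch_size")
    per_axis = size // patch_size
    return per_axis ** 3


# Assumption: default patch size of 16, which is not stated in this thread.
ASSUMED_DEFAULT_PATCH = 16

# 160^3 input with 5^3 patches, as described in the comment above.
small_input = patches_per_volume(160, 5)
# 512^3 input with the assumed default patch size.
full_input = patches_per_volume(512, ASSUMED_DEFAULT_PATCH)

print(small_input, full_input)
```

If that assumption holds, both settings give 32 patches per axis, so the transformer blocks see the same number of tokens. That would be consistent with the encoder still running out of memory on the smaller input.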

peterant330 commented 11 months ago

Actually, the split is a little troublesome. Since most of the memory cost comes from the image encoder, I had to split the encoder itself across multiple GPUs. The encoder is composed of multiple blocks; we put some blocks on the first GPU and the rest on the second. As you can see in the code (image_encoder.py, line 156), we use two for loops to handle blocks[:6] and blocks[6:12] separately. That is how I split the encoder.
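For readers who want to try the same idea, here is a minimal PyTorch sketch of the block-wise split described above. It is not the repository's code: `ToyBlock`, `SplitEncoder`, and `pick_devices` are hypothetical names, the blocks are simple stand-ins for the real encoder blocks, and the 6/6 split only mirrors the `blocks[:6]` / `blocks[6:12]` loops mentioned in the comment. The sketch falls back to CPU when fewer than two GPUs are visible.

```python
import torch
import torch.nn as nn


def pick_devices():
    """Return two CUDA devices if available, else CPU twice so the sketch still runs."""
    if torch.cuda.is_available() and torch.cuda.device_count() >= 2:
        return torch.device("cuda:0"), torch.device("cuda:1")
    cpu = torch.device("cpu")
    return cpu, cpu


class ToyBlock(nn.Module):
    """Hypothetical stand-in for one encoder block (norm + MLP with a residual)."""

    def __init__(self, dim: int):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim)
        )

    def forward(self, x):
        return x + self.mlp(self.norm(x))


class SplitEncoder(nn.Module):
    """Places the first `split` blocks on one device and the rest on another."""

    def __init__(self, dim: int = 32, depth: int = 12, split: int = 6):
        super().__init__()
        self.dev0, self.dev1 = pick_devices()
        self.split = split
        self.blocks = nn.ModuleList([ToyBlock(dim) for _ in range(depth)])
        for blk in self.blocks[:split]:
            blk.to(self.dev0)
        for blk in self.blocks[split:]:
            blk.to(self.dev1)

    def forward(self, x):
        # First loop: blocks living on the first device.
        x = x.to(self.dev0)
        for blk in self.blocks[: self.split]:
            x = blk(x)
        # Move the activations once, then run the remaining blocks.
        x = x.to(self.dev1)
        for blk in self.blocks[self.split:]:
            x = blk(x)
        return x


if __name__ == "__main__":
    enc = SplitEncoder()
    out = enc(torch.randn(1, 64, 32))
    print(out.shape, out.device)
```

Note that the output ends up on the second device, so anything downstream (decoder, loss, labels) has to live there or be moved explicitly. The two halves also run one after the other, so this kind of split saves memory per GPU but does not speed up training.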