When running a DDP script, what is the influence of if_split_encoder_gpus on the training process?

ShChen233 commented 6 months ago

Guhanxue commented 6 months ago

Hi, this argument exists because ViT-H is huge relative to the A6000 GPUs available in our lab (hahaha). If the image encoder cannot fit on one GPU, you can use model parallelism: put some of its transformer blocks on one GPU and the remaining blocks on another.

For example, suppose I set args.if_split_encoder_gpus = True, args.gpu_fractions = [0.5, 0.5], and args.devices = [0, 1]. If we are training ViT-H, this means half of the image encoder (16 transformer blocks) goes on GPU 0 and the remaining half goes on GPU 1.
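
To make the split concrete, here is a minimal sketch of how fractions like these could be turned into a block-to-GPU placement. It is not the repository's actual code: the helper name `assign_blocks` and the rounding rule are assumptions for illustration, and the only facts taken from the example above are the values of `gpu_fractions` and `devices` and the 32 blocks of ViT-H (two halves of 16).

```python
def assign_blocks(num_blocks, gpu_fractions, devices):
    """Return a list where entry i is the device id for transformer block i.

    Hypothetical helper: blocks are assigned in order, and each device
    receives a contiguous run sized by its fraction.
    """
    if len(gpu_fractions) != len(devices):
        raise ValueError("gpu_fractions and devices must have the same length")
    if abs(sum(gpu_fractions) - 1.0) > 1e-6:
        raise ValueError("gpu_fractions must sum to 1")

    placement = []
    start = 0
    cumulative = 0.0
    for i, (fraction, device) in enumerate(zip(gpu_fractions, devices)):
        cumulative += fraction
        # The last device takes whatever is left, so no block is dropped
        # by rounding.
        if i == len(devices) - 1:
            end = num_blocks
        else:
            end = round(cumulative * num_blocks)
        placement.extend([device] * (end - start))
        start = end
    return placement


# ViT-H image encoder: 32 transformer blocks, split evenly over GPUs 0 and 1.
placement = assign_blocks(32, [0.5, 0.5], [0, 1])
print(placement.count(0), placement.count(1))  # prints "16 16"
```

In a PyTorch model-parallel setup, a placement like this would typically be applied by moving each block to its device and moving the activations to the matching device between blocks in the forward pass. How the repository does this internally is not shown here.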

It is only needed if you have a small GPU (cry cry) but need to train a large model.

Please let me know if you have further questions!

ShChen233 commented 6 months ago

Thanks~

mazurowski-lab / finetune-SAM, issue #2