Open 99sun99 opened 1 year ago
Why can I only replicate the whole model on each GPU (data parallelism), rather than use model parallelism, where different parts of the model are distributed across the GPUs?
I followed the instructions on the webpage.