Open sladenheim opened 4 months ago
To allow training larger models split across GPUs with the DefaultTrainer class, implement model parallel capabilities using the torch.nn.parallel.DistributedDataParallel functionality.
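For reference, a minimal sketch of what wrapping a trainer's model in DistributedDataParallel involves (this is not micro_sam's actual API; it uses a toy linear model, a single-process gloo process group on CPU, and hypothetical env-var defaults — in a real multi-GPU launch these would be set by torchrun and the backend would be nccl). Note that DDP replicates the full model on each rank and synchronizes gradients, so it parallelizes over data rather than splitting a single model across devices:

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process setup for illustration; torchrun normally sets these.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

# Toy stand-in for the trainer's model.
model = torch.nn.Linear(4, 2)
ddp_model = DDP(model)  # gradients are all-reduced across ranks on backward

# One training step, as a trainer's inner loop would run it.
optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.1)
x, y = torch.randn(8, 4), torch.randn(8, 2)
optimizer.zero_grad()
loss = torch.nn.functional.mse_loss(ddp_model(x), y)
loss.backward()
optimizer.step()

dist.destroy_process_group()
```

Splitting one model across GPUs (true model parallelism for models too large for a single device) would instead need something like FSDP or pipeline parallelism; DDP alone only helps when each replica fits on one GPU.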
Thanks for opening this @sladenheim. We are currently focusing on a new release of micro_sam, but will then look into this.