SD3 Training on Multi GPU

Hi there. SD3 training on 1 GPU works just fine, but as soon as I enable multi-GPU training with 2 GPUs, I get this error:

[Screenshot of the error: Clipboard_08-18-2024_01]

I'm on Ubuntu 22.04. Is there anything special required to train SD3 on multiple GPUs?

Also, here is my config:

adaptive_noise_scale = 0
bucket_no_upscale = true
bucket_reso_steps = 64
cache_latents = true
cache_text_encoder_outputs = true
caption_dropout_every_n_epochs = 0
caption_dropout_rate = 0
caption_extension = ".txt"
clip_g = "/home/user/Downloads/clip_g.safetensors"
clip_l = "/home/user/Downloads/clip_l.safetensors"
clip_skip = 1
dynamo_backend = "no"
enable_bucket = true
epoch = 100
full_bf16 = true
gradient_accumulation_steps = 1
huber_c = 0.1
huber_schedule = "snr"
keep_tokens = 0
learning_rate = 2e-7
learning_rate_te = 2e-7
logging_dir = "/home/user/Desktop/training/logs"
logit_mean = 0
logit_std = 1
loss_type = "l2"
lr_scheduler = "cosine"
lr_scheduler_args = []
lr_scheduler_num_cycles = 1
lr_scheduler_power = 1
lr_warmup_steps = 17200
max_bucket_reso = 2048
max_data_loader_n_workers = 0
max_timestep = 1000
max_token_length = 225
max_train_steps = 172000
min_bucket_reso = 256
mixed_precision = "bf16"
mode_scale = 1.29
multires_noise_discount = 0.3
multires_noise_iterations = 0
noise_offset = 0
noise_offset_type = "Original"
optimizer_type = "AdamW"
output_dir = "/home/user/Desktop/training/model"
output_name = "cats_sd3"
persistent_data_loader_workers = 0
pretrained_model_name_or_path = "/home/user/Downloads/sd3_medium.safetensors"
prior_loss_weight = 1
resolution = "1024,1024"
sample_every_n_epochs = 10
sample_prompts = "/home/user/Desktop/training/model/sample/prompt.txt"
sample_sampler = "euler_a"
save_clip = true
save_every_n_epochs = 5
save_model_as = "safetensors"
save_precision = "fp16"
save_t5xxl = true
sdpa = true
t5xxl = "/home/user/Downloads/t5xxl_fp16.safetensors"
t5xxl_dtype = "fp16"
text_encoder_batch_size = 1
train_batch_size = 1
train_data_dir = "/home/user/Desktop/training/images"
wandb_run_name = "cats_sd3"
weighting_scheme = "logit_normal"
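
For context on what changes between the 1-GPU and 2-GPU runs, here is a rough sketch of the arithmetic implied by the config above. It assumes plain data-parallel training, where `train_batch_size` is a per-GPU value, so the number of samples per optimizer step scales with the number of processes. The helper names `effective_batch_size` and `warmup_fraction` are made up for illustration and are not part of sd-scripts.

```python
# Values copied from the config above.
config = {
    "train_batch_size": 1,
    "gradient_accumulation_steps": 1,
    "max_train_steps": 172000,
    "lr_warmup_steps": 17200,
}


def effective_batch_size(cfg, num_processes):
    """Samples per optimizer step, ASSUMING train_batch_size is per GPU
    and each process (one per GPU) handles its own batch."""
    return (
        cfg["train_batch_size"]
        * cfg["gradient_accumulation_steps"]
        * num_processes
    )


def warmup_fraction(cfg):
    """Share of the total training steps spent in learning-rate warmup."""
    return cfg["lr_warmup_steps"] / cfg["max_train_steps"]


# Compare the single-GPU run with the two-GPU run described in this post.
for num_gpus in (1, 2):
    size = effective_batch_size(config, num_gpus)
    print(f"{num_gpus} GPU(s): effective batch size = {size}")

print(f"warmup fraction = {warmup_fraction(config):.0%}")
```

Under that assumption, going from 1 to 2 GPUs doubles the effective batch size while the per-GPU settings stay the same, so the learning rate and step counts may need revisiting once the multi-GPU run starts at all.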

kohya-ss / sd-scripts

SD3 Training on Multi GPU #1473