kohya-ss / sd-scripts

Apache License 2.0

SD3 Training Error #1465

Open SteVoit opened 1 month ago

SteVoit commented 1 month ago

When I try to train with a fresh installation of kohya_ss on the sd3-flux.1 branch (checked out today), I always end up with this error:

[Screenshot: Clipboard_08-16-2024_01]

```
File "/home/user/Desktop/kohyass/sd-scripts/library/sd3_models.py", line 1591, in forward
    tokens = torch.LongTensor(tokens).to(device)
TypeError: expected TensorOptions(dtype=long int, device=cpu, layout=Strided, requires_grad=false (default), pinned_memory=false (default), memory_format=(nullopt)) (got TensorOptions(dtype=long int, device=cuda:0, layout=Strided, requires_grad=false (default), pinned_memory=false (default), memory_format=(nullopt)))
```

Can anyone explain what I'm doing wrong?
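For context on the error itself: PyTorch's legacy `torch.LongTensor(...)` constructor always builds a CPU tensor and rejects input data that already lives on a CUDA device, which matches the `expected ... device=cpu ... (got ... device=cuda:0 ...)` message above. A minimal, device-agnostic sketch (illustrative only, not the actual sd-scripts fix; the token values are made up) is to use `torch.as_tensor`, which accepts lists or tensors on any device:

```python
import torch

# Hypothetical token ids standing in for the tokenizer output in sd3_models.py.
tokens = [101, 2023, 2003, 102]

# Pick whatever device is available; on a training box this would be "cuda".
device = "cuda" if torch.cuda.is_available() else "cpu"

# torch.as_tensor places the result directly on the requested device,
# avoiding the legacy constructor's CPU-only restriction.
t = torch.as_tensor(tokens, dtype=torch.long, device=device)
print(t.dtype, tuple(t.shape))
```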

SteVoit commented 1 month ago

here is my config:

```toml
adaptive_noise_scale = 0
bucket_no_upscale = true
bucket_reso_steps = 64
cache_latents = true
caption_dropout_every_n_epochs = 0
caption_dropout_rate = 0
caption_extension = ".txt"
clip_g = "/home/user/Downloads/clip_g.safetensors"
clip_l = "/home/user/Downloads/clip_l.safetensors"
clip_skip = 1
dynamo_backend = "no"
enable_bucket = true
epoch = 100
full_bf16 = true
gradient_accumulation_steps = 1
huber_c = 0.1
huber_schedule = "snr"
keep_tokens = 0
learning_rate = 2e-7
learning_rate_te = 2e-7
logging_dir = "/home/user/Desktop/training/logs"
logit_mean = 0
logit_std = 1
loss_type = "l2"
lr_scheduler = "cosine"
lr_scheduler_args = []
lr_scheduler_num_cycles = 1
lr_scheduler_power = 1
lr_warmup_steps = 17200
max_bucket_reso = 2048
max_data_loader_n_workers = 0
max_timestep = 1000
max_token_length = 225
max_train_steps = 172000
min_bucket_reso = 256
mixed_precision = "bf16"
mode_scale = 1.29
multires_noise_discount = 0.3
multires_noise_iterations = 0
noise_offset = 0
noise_offset_type = "Original"
optimizer_type = "AdamW"
output_dir = "/home/user/Desktop/training/model"
output_name = "cats_sd3"
persistent_data_loader_workers = 0
pretrained_model_name_or_path = "/home/user/Downloads/sd3_medium.safetensors"
prior_loss_weight = 1
resolution = "1024,1024"
sample_prompts = "/home/user/Desktop/training/model/sample/prompt.txt"
sample_sampler = "euler_a"
save_clip = true
save_every_n_epochs = 5
save_model_as = "safetensors"
save_precision = "fp16"
save_t5xxl = true
sdpa = true
shuffle_caption = true
t5xxl = "/home/user/Downloads/t5xxl_fp16.safetensors"
t5xxl_dtype = "fp16"
text_encoder_batch_size = 1
train_batch_size = 1
train_data_dir = "/home/user/Desktop/training/images"
wandb_run_name = "cats_sd3"
weighting_scheme = "logit_normal"
```

kohya-ss commented 1 month ago

This issue occurred when --cache_text_encoder_outputs was not specified. I have fixed it.

Please note that training without --cache_text_encoder_outputs requires more than 30GB of VRAM. If you want to train with less than 24GB of VRAM, please add the option.
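The VRAM saving comes from the fact that SD3's text encoders (CLIP-L, CLIP-G, and especially T5-XXL) are frozen during training, so their outputs can be computed once per caption up front and the encoders then unloaded from VRAM. A rough sketch of the idea, assuming a hypothetical `encode` stand-in for the real encoders (this is an illustration of the caching concept, not the actual sd-scripts implementation):

```python
import torch

def encode(caption: str) -> torch.Tensor:
    # Hypothetical stand-in for the frozen text encoders; a real encoder
    # returns per-token embeddings with a fixed sequence length and width.
    return torch.randn(77, 4096)

def cache_text_encoder_outputs(captions):
    cache = {}
    with torch.no_grad():  # the encoders are frozen, so no gradients are needed
        for caption in captions:
            # Store the result on CPU; once every caption is cached,
            # the encoder weights can be freed from GPU memory.
            cache[caption] = encode(caption).cpu()
    return cache

cache = cache_text_encoder_outputs(["a photo of a cat", "a photo of a dog"])
# During training, look up the cached embedding instead of re-encoding:
emb = cache["a photo of a cat"]
```

The trade-off is that cached outputs are fixed, which is why caption-altering options such as shuffle_caption (enabled in the config above) cannot be combined with output caching.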