Akegarasu / lora-scripts

SD-Trainer. LoRA & Dreambooth training scripts & GUI using kohya-ss's trainer, for diffusion models.
GNU Affero General Public License v3.0

AttributeError: 'NoneType' object has no attribute 'view' when training flux-lora with loss_type = "smooth_l1" #565

Open SunWintor opened 1 month ago

SunWintor commented 1 month ago

Description:

I'm experiencing an issue when training a flux model using the lora-scripts project (version 1.9.0). The training process crashes with an AttributeError stating that 'NoneType' object has no attribute 'view' when using loss_type = "smooth_l1".

Error Message:


Traceback (most recent call last):
  File "A:\funny\flux_train\Lora训练-测试版\lora-scripts-v1.9.0\scripts\dev\flux_train_network.py", line 565, in <module>
    trainer.train(args)
  File "A:\funny\flux_train\Lora训练-测试版\lora-scripts-v1.9.0\scripts\dev\train_network.py", line 1192, in train
    loss = train_util.conditional_loss(
  File "A:\funny\flux_train\Lora训练-测试版\lora-scripts-v1.9.0\scripts\dev\library\train_util.py", line 5815, in conditional_loss
    huber_c = huber_c.view(-1, 1, 1, 1)
AttributeError: 'NoneType' object has no attribute 'view'
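
For context, the traceback shows train_util.conditional_loss reshaping the Huber threshold (huber_c.view(-1, 1, 1, 1)) before computing the smooth L1 / pseudo-Huber loss, but huber_c arrives as None, i.e. nothing upstream computed it for this flux-lora configuration. Below is a minimal, self-contained sketch of that failure mode in plain PyTorch; it is a simplified illustration only, not the actual kohya-ss implementation, and the pseudo-Huber formula is just representative.

import torch

def conditional_loss_sketch(model_pred, target, loss_type, huber_c=None):
    # Simplified illustration of the code path named in the traceback;
    # NOT the real train_util.conditional_loss.
    if loss_type == "l2":
        return torch.nn.functional.mse_loss(model_pred, target, reduction="none")
    if loss_type == "smooth_l1":
        # This is the failing line: huber_c must be a tensor here,
        # but it is None when no Huber threshold was computed upstream.
        huber_c = huber_c.view(-1, 1, 1, 1)
        diff = model_pred - target
        return 2 * huber_c * (torch.sqrt(diff**2 + huber_c**2) - huber_c)
    raise NotImplementedError(loss_type)

pred, target = torch.randn(1, 4, 8, 8), torch.randn(1, 4, 8, 8)
try:
    conditional_loss_sketch(pred, target, "smooth_l1", huber_c=None)
except AttributeError as e:
    print(e)  # 'NoneType' object has no attribute 'view'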

Training Parameters:

Here are the training parameters I'm using:

model_train_type = "flux-lora"
pretrained_model_name_or_path = "A:/funny/ConfyUI-aki/ComfyUI-aki-v1.3/models/unet/flux1-dev.sft"
ae = "A:/funny/ConfyUI-aki/ComfyUI-aki-v1.3/models/vae/ae.safetensors"
clip_l = "A:/funny/ConfyUI-aki/ComfyUI-aki-v1.3/models/clip/clip_l.safetensors"
t5xxl = "A:/funny/ConfyUI-aki/ComfyUI-aki-v1.3/models/clip/t5xxl_fp16.safetensors"
clip_g = "A:/funny/ConfyUI-aki/ComfyUI-aki-v1.3/models/clip/clip_g.safetensors"
timestep_sampling = "shift"
sigmoid_scale = 1
model_prediction_type = "raw"
discrete_flow_shift = 3.158
loss_type = "smooth_l1"
guidance_scale = 1
train_data_dir = "A:/funny/flux_train/Lora训练-测试版/lora-scripts-v1.9.0/train/model_v5"
prior_loss_weight = 1
resolution = "1024,1024"
enable_bucket = true
min_bucket_reso = 256
max_bucket_reso = 2048
bucket_reso_steps = 64
bucket_no_upscale = true
output_name = "model_v5"
output_dir = "./output"
save_model_as = "safetensors"
save_precision = "bf16"
save_every_n_epochs = 2
max_train_epochs = 30
train_batch_size = 1
gradient_checkpointing = true
gradient_accumulation_steps = 1
network_train_unet_only = true
network_train_text_encoder_only = false
learning_rate = 0.0001
unet_lr = 1
text_encoder_lr = 1
lr_scheduler = "cosine"
lr_warmup_steps = 0
lr_scheduler_num_cycles = 0
optimizer_type = "Prodigy"
optimizer_args = [ "decouple=True", "weight_decay=0.01", "use_bias_correction=True", "d_coef=1" ]
network_module = "networks.lora_flux"
network_dim = 8
network_alpha = 8
log_with = "tensorboard"
logging_dir = "./logs"
caption_extension = ".txt"
shuffle_caption = false
weighted_captions = false
keep_tokens = 0
seed = 21337
clip_skip = 1
mixed_precision = "bf16"
fp8_base = true
sdpa = true
lowram = false
cache_latents = true
cache_latents_to_disk = true
cache_text_encoder_outputs = true
cache_text_encoder_outputs_to_disk = true
persistent_data_loader_workers = true
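
A temporary workaround (an assumption on my part, not verified against this exact setup): switching to plain L2 loss should sidestep the crash, since the l2 branch of the loss computation never touches huber_c.

loss_type = "l2"    # instead of "smooth_l1"; avoids the huber_c.view(...) branch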

Steps to Reproduce:

  1. Set up the training environment using the parameters specified above.
  2. Begin training.
  3. Training crashes with the AttributeError shown above when the loss is computed.

Investigation and Findings:

Additional Information:

Environment:

Expected Behavior:

Training runs to completion with loss_type = "smooth_l1", saving LoRA checkpoints as configured.

Actual Behavior:

Training crashes when the loss is computed, with AttributeError: 'NoneType' object has no attribute 'view' raised from train_util.conditional_loss.

Request:

Please confirm whether loss_type = "smooth_l1" is supported for flux-lora training and, if so, how this crash can be avoided or fixed.
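
If it helps triage, here is one possible direction for a fix, sketched against a hypothetical helper rather than the project's real code: validate the Huber threshold before it is reshaped, so a missing value either fails with a descriptive message or falls back to a default instead of raising an AttributeError.

import torch

def resolve_huber_c(huber_c, loss_type, batch_size, default_c=0.1):
    # Hypothetical guard, not actual lora-scripts / sd-scripts code:
    # ensure loss types that need a Huber threshold actually receive one.
    if loss_type in ("huber", "smooth_l1"):
        if huber_c is None:
            # Option A: fail loudly with a clear message.
            # raise ValueError(f"loss_type={loss_type!r} requires huber_c, but it is None")
            # Option B: fall back to a constant per-sample threshold.
            huber_c = torch.full((batch_size,), default_c)
        return huber_c.view(-1, 1, 1, 1)
    return huber_c

# Example: a missing threshold now yields a usable (1, 1, 1, 1) tensor instead of crashing.
huber_c = resolve_huber_c(None, "smooth_l1", batch_size=1)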

Thank you for your assistance!

If you need any additional information or logs, please let me know, and I'll be happy to provide them.

bobo3313 commented 2 weeks ago

me too