kohya-ss / sd-scripts

Apache License 2.0

Blurry images by LORA schnell #1687

Open JEFFSEVENTHSENSE opened 1 month ago

JEFFSEVENTHSENSE commented 1 month ago

#!/bin/bash

CUDA_VISIBLE_DEVICES=7 accelerate launch \
  --mixed_precision bf16 \
  --num_cpu_threads_per_process 1 \
  flux_train_network.py \
  --pretrained_model_name_or_path flux1-schnell.safetensors \
  --clip_l /home/dluser/development/Jeff/sd-scripts/sd3/clip_l.safetensors \
  --t5xxl /home/dluser/development/Jeff/sd-scripts/sd3/t5xxl_fp16.safetensors \
  --ae ae.safetensors \
  --save_model_as safetensors \
  --sdpa \
  --persistent_data_loader_workers \
  --max_data_loader_n_workers 2 \
  --seed 42 \
  --gradient_checkpointing \
  --mixed_precision bf16 \
  --save_precision bf16 \
  --network_module networks.lora_flux \
  --network_dim 4 \
  --optimizer_type Adafactor \
  --learning_rate 1e-4 \
  --highvram \
  --max_train_epochs 16 \
  --save_every_n_epochs 4 \
  --dataset_config dataset_1024_bs2.toml \
  --output_dir /home/dluser/development/Jeff/LoRA \
  --output_name flux-lora-jeff \
  --timestep_sampling shift \
  --discrete_flow_shift 3.1582 \
  --model_prediction_type raw \
  --guidance_scale 1.0 \
  --network_train_unet_only \
  --cache_text_encoder_outputs \
  --cache_text_encoder_outputs_to_disk

That is my training script.

The dataset settings are as follows:

[general]
shuffle_caption = false
caption_extension = '.txt'
keep_tokens = 1

[[datasets]]
resolution = 512
batch_size = 1
keep_tokens = 1

  [[datasets.subsets]]
  image_dir = '/home/dluser/development/Jeff/jeff'
  class_tokens = 'J3FF'
  num_repeats = 10

The dataset has 10 images in total. I trained for 2.5k steps on an A100, which took about 30 minutes. The resulting LoRA is about 40 MB. I then ran inference with the LoRA using flux_minimal_inference.py for schnell, and these are the results:

J3FF sunset lighting portrait with golden hues illuminating his face

J3FF well-lit portrait highlighting features in soft natural light

Why are the generated images blurry, even at a resolution of 512 by 512?
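
For reference, the step count implied by the posted settings can be estimated with a few lines of Python. This is only a rough sketch: `estimate_total_steps` is a hypothetical helper, not part of sd-scripts, and it ignores bucketing and gradient accumulation.

```python
import math


def estimate_total_steps(num_images, num_repeats, batch_size, epochs):
    """Rough step estimate; hypothetical helper, ignores bucketing and gradient accumulation."""
    samples_per_epoch = num_images * num_repeats
    steps_per_epoch = math.ceil(samples_per_epoch / batch_size)
    return steps_per_epoch * epochs


# Values taken from the command and dataset settings posted above.
total = estimate_total_steps(num_images=10, num_repeats=10, batch_size=1, epochs=16)
print(total)
```

With the posted values this comes to 1600 steps, which is lower than the roughly 2.5k steps reported, so the actual run may have used different settings from the ones shown here.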

sdbds commented 1 month ago

Schnell doesn't use shift, so you don't need to set --timestep_sampling shift.
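
To illustrate what the shift setting changes, here is a minimal sketch of shifted timestep sampling. It assumes the commonly used formulation in which a sigmoid of a normal sample is remapped as t' = shift * t / (1 + (shift - 1) * t). The function names and the `sigmoid_scale` default are illustrative assumptions, not the sd-scripts API.

```python
import math
import random


def shift_timestep(t, shift):
    """Remap t in (0, 1); shift > 1 pushes values toward 1 (higher noise)."""
    return (shift * t) / (1.0 + (shift - 1.0) * t)


def sample_shifted_timestep(shift, sigmoid_scale=1.0, rng=random):
    """Draw one timestep: sigmoid of a scaled normal sample, then apply the shift."""
    z = rng.gauss(0.0, 1.0) * sigmoid_scale
    t = 1.0 / (1.0 + math.exp(-z))
    return shift_timestep(t, shift)


# Compare the remapping with the posted shift value against no shift (shift = 1).
for t in (0.1, 0.5, 0.9):
    print(t, round(shift_timestep(t, 3.1582), 4), shift_timestep(t, 1.0))
```

With shift = 1 the remapping leaves t unchanged. With the posted value of 3.1582, sampled timesteps are skewed toward the high-noise end of the range.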