Lucarqi opened this issue 2 weeks ago
Looking at your results, I can spot a couple of issues with your settings:
@GaParmar thanks for your reply. For 1: at the beginning I used the default settings, including the clip_sim loss, but it had the same effect. For 2: I will try your advice.
I want to use pix2pix-turbo for a medical image generation task, where the control condition is a set of different label images. My hyperparameter settings are as follows:

```shell
accelerate launch src/train_pix2pix_turbo.py \
    --pretrained_model_name_or_path="stabilityai/sd-turbo" \
    --output_dir="output/pix2pix_turbo/test" \
    --dataset_folder="data/data_u3/pix2pix" \
    --resolution=256 \
    --num_training_epochs=500 \
    --train_batch_size=2 \
    --enable_xformers_memory_efficient_attention \
    --viz_freq 25 \
    --track_val_fid \
    --report_to "wandb" \
    --tracker_project_name "test"
```
My accelerate config is:

```yaml
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: MULTI_GPU
downcast_bf16: 'no'
enable_cpu_affinity: false
gpu_ids: all
machine_rank: 0
main_training_function: main
mixed_precision: 'no'
num_machines: 1
num_processes: 3
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
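One detail worth double-checking (my own inference, not something stated in the repo: the command above shows no gradient-accumulation flag, so I assume one optimizer step per forward pass): with `num_processes: 3` and `--train_batch_size=2`, the effective global batch size is quite small, which can matter for GAN-style training stability.

```python
# Minimal sanity check of the effective global batch size implied by the
# config above. The "no gradient accumulation" factor of 1 is an assumption,
# since the launch command does not show an accumulation flag.
num_processes = 3          # from num_processes in the accelerate config
train_batch_size = 2       # from --train_batch_size in the launch command
grad_accum_steps = 1       # assumption: no accumulation flag was passed

effective_batch_size = num_processes * train_batch_size * grad_accum_steps
print(effective_batch_size)  # → 6
```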
I found that during training, the model's output images are always the same, and significantly different from the real images:
And during testing, all synthetic images are also almost identical:
training wandb:
validation wandb:
This issue has been troubling me for a while. Does anyone have any insight into it?