Alan-Han opened 2 weeks ago
What are the --condition_resolution and --resolution flags?
I used --condition_resolution=64 --resolution=256. By the way, I use the original text from the LAION-Aesthetics-V2-6.5plus dataset as the input text.
From the log screenshot it seems that you are using Stable Diffusion 1.5, right? If that's the case, you would need to set --resolution=512, which is the output resolution of SD1.5. Letting it output 256x256 images won't work well.
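For context on what the two flags plausibly control in a tile setup: --resolution is the size of the training target, and --condition_resolution is the size the image is downsampled to before being upsampled back into a blurry conditioning image. Below is a minimal Python sketch of that preprocessing, assuming this down/up-sampling scheme (the function name is illustrative, not from the repo):

from PIL import Image

def make_tile_condition(img: Image.Image, resolution: int = 256,
                        condition_resolution: int = 64) -> Image.Image:
    """Build a blurry tile-conditioning image by down/up-sampling the target."""
    img = img.resize((resolution, resolution), Image.BICUBIC)   # training target size
    low = img.resize((condition_resolution, condition_resolution), Image.BICUBIC)
    return low.resize((resolution, resolution), Image.BICUBIC)  # back to model input size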
Sorry, the confusion might have been caused by the naming of my experiment. I use miniSD, as you did. Here is the complete training script:
# hyperparameters
MAX_STEPS=10000
LR=1e-5
BS=32
PROMPT_DROPOUT=0.05
trail_name="sd15_control_tile_${BS}_${LR}_${MAX_STEPS}_dropout${PROMPT_DROPOUT}_v31"
OUTPUT_DIR="exp/controlnet_tile/${trail_name}"
MODEL_NAME="lambdalabs/miniSD-diffusers"
# launch ControlNet training with Accelerate
accelerate launch --main_process_port 12346 train_controlnet.py \
--pretrained_model_name_or_path=$MODEL_NAME \
--output_dir=$OUTPUT_DIR \
--condition_resolution=64 \
--resolution=256 \
--learning_rate=${LR} \
--max_train_steps=${MAX_STEPS} \
--max_train_samples=85000 \
--dataloader_num_workers=8 \
--train_shards_path_or_url="/mnt/train_data/laion_6.5plus_tars/laion_{000000..000100..100}.tar" \
--validation_image \
"conditioning_image_1.png" \
"conditioning_image_2.jpeg" \
--validation_prompt \
"a dog sitting on the grass" \
"home office" \
--validation_steps=500 \
--checkpointing_steps=1000 --checkpoints_total_limit=10 \
--train_batch_size=${BS} \
--gradient_checkpointing --enable_xformers_memory_efficient_attention \
--gradient_accumulation_steps=1 \
--use_8bit_adam \
--resume_from_checkpoint=latest \
--mixed_precision="fp16" \
--tracker_project_name="controlnet_sd15_tile_v3" \
--tracker_project_trail_name ${trail_name} \
--proportion_empty_prompts ${PROMPT_DROPOUT} \
--report_to=wandb
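Side note on --proportion_empty_prompts: in the stock diffusers train_controlnet.py, this fraction of captions is replaced by the empty string during training, so the model also learns an unconditional branch for classifier-free guidance. A minimal sketch of that logic:

import random

def maybe_drop_prompt(caption: str, proportion_empty_prompts: float = 0.05) -> str:
    """Replace a caption with the empty string with the given probability."""
    return "" if random.random() < proportion_empty_prompts else caption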
I see. That's a bit weird then. Why did you set --max_train_samples=85000 though (meaning that you only trained on 85K samples)?
Sorry, that was a typo. I revised this value in later experiments; it was originally set to 500,000. Did you also use LAION's text during the training process? Have you tried leaving the text empty? It seems like the use of controlnet tile doesn't require text guidance.
No, I didn't try leaving the text empty; I was using LAION's text. It is true, though, that after training, when I change the prompt for the same input image, the output doesn't seem to change much.
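If you want to quantify that, a quick check is to run the trained ControlNet with and without a prompt on the same conditioning image and compare the outputs. A minimal diffusers sketch, assuming an SD1.5-style checkpoint layout (the checkpoint path is a placeholder):

import torch
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
from diffusers.utils import load_image

# Placeholder path for a locally trained tile ControlNet checkpoint.
controlnet = ControlNetModel.from_pretrained("path/to/controlnet_tile_checkpoint",
                                             torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "lambdalabs/miniSD-diffusers", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

cond = load_image("conditioning_image_1.png")
for tag, prompt in [("text", "a dog sitting on the grass"), ("empty", "")]:
    image = pipe(prompt, image=cond, num_inference_steps=30).images[0]
    image.save(f"out_{tag}.png")  # visually compare prompted vs. unprompted output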
Thanks for your help. Please let me know if anyone else meets the same issue.
Will do. As of now I haven't heard of anyone else reporting the same issue.
Hello Steven, first of all I would like to express my gratitude for your detailed tutorial. I used your training script but cannot replicate the result. Here are the differences in my training details:
The other training parameters are exactly the same as those shown in the README. Would you care to share some tips or advice? Many thanks. My result is: