I want to train my own customized model. As a test, I tried to reproduce the provided customized model (Taylor Swift) with the diffusers library.
However, the performance is much worse than the provided pre-trained model. I wonder about the training configs, such as the number of training images, the learning rate, and max_train_steps.
For human faces, the training version that finetunes the encoder yields better performance; I also mention that in the README.md. The two scripts you used do not include encoder finetuning.
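If you want to stay with the diffusers script, it exposes a --train_text_encoder flag that turns on text-encoder finetuning. A minimal sketch of the basic command with that flag added (the hyperparameters simply mirror the ones used later in this thread and are not a verified recipe; a lower learning rate is often suggested once the text encoder is trained as well):

# same settings as the command below, plus --train_text_encoder
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --train_text_encoder \
  --instance_prompt="a photo of a sks woman" \
  --resolution=768 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --use_8bit_adam \
  --lr_warmup_steps=0 \
  --max_train_steps=400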
Thank you for the perfect work!
I use almost the same command as in https://github.com/huggingface/diffusers/tree/main/examples/dreambooth, but change the resolution to 768 (Stable Diffusion 2) and train with only 8 images.
For example:
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --output_dir=$OUTPUT_DIR \
  --instance_prompt="a photo of a sks woman" \
  --resolution=768 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --use_8bit_adam \
  --enable_xformers_memory_efficient_attention \
  --lr_warmup_steps=0 \
  --max_train_steps=400 \
  --push_to_hub
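For completeness, the command assumes the usual environment variables from the diffusers example. I set them roughly like this (the model id is the 768-pixel Stable Diffusion 2 checkpoint; the directories are just placeholders for my local paths):

# hypothetical values; adjust the paths to your own data
export MODEL_NAME="stabilityai/stable-diffusion-2"
export INSTANCE_DIR="./data/taylor_swift"
export OUTPUT_DIR="./dreambooth-sd2-taylor-swift"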
I also tried training with the prior-preservation loss:
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of a sks woman" \
  --class_prompt="a photo of a woman" \
  --resolution=768 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --use_8bit_adam \
  --enable_xformers_memory_efficient_attention \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800 \
  --push_to_hub
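The prior-preservation run additionally needs $CLASS_DIR; as far as I understand, the script fills it with up to --num_class_images generated samples of the class prompt if the folder contains fewer images, so it can start out empty. Something like:

# hypothetical placeholder; auto-filled with generated "a photo of a woman" samples
export CLASS_DIR="./class_images/woman"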
I'd appreciate any tips. Thank you!