Closed: YaqiWangCV closed this issue 1 week ago
Thanks for your attention to our work! Can you show me some examples?
It seems that the content of the generated images varies drastically between epochs.
This phenomenon is very strange; we did not encounter such a situation during the training process. Can you test the results using the pre-trained models we provided? If there are no issues, then the problem might be with the training process. If the issue persists, then it might be a problem with the testing process.
Testing directly with the pre-trained models you provided works fine; the problem lies in the training process.
This is my training script:
```shell
accelerate launch train_addsr.py \
  --pretrained_model_name_or_path="/pretrained_models/stable-diffusion-2-base" \
  --controlnet_model_name_or_path_Tea='/pretrained_models/SeeSR/models/seesr' \
  --unet_model_name_or_path_Tea='/pretrained_models/SeeSR/models/seesr' \
  --controlnet_model_name_or_path_Stu='/pretrained_models/SeeSR/models/seesr' \
  --unet_model_name_or_path_Stu='/pretrained_models/SeeSR/models/seesr' \
  --output_dir ${output_dir} \
  --root_folders '/dataset/lowlevel/seesr_ori' \
  --ram_ft_path '/pretrained_models/SeeSR/models/DAPE.pth' \
  --enable_xformers_memory_efficient_attention \
  --mixed_precision="fp16" \
  --resolution=512 \
  --learning_rate=2e-5 \
  --train_batch_size=6 \
  --gradient_accumulation_steps=2 \
  --null_text_ratio=0.5 \
  --dataloader_num_workers=4 \
  --max_train_steps=50000 \
  --checkpointing_steps=500
```
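For reference, the effective per-optimizer-step batch size implied by these flags can be sanity-checked with a quick calculation (a sketch assuming single-process training; `num_processes` is an assumption and should match your accelerate configuration):

```python
# Sketch: effective batch size implied by the training flags above.
# num_processes is hypothetical; set it to your accelerate process count.
train_batch_size = 6
gradient_accumulation_steps = 2
num_processes = 1

effective_batch_size = train_batch_size * gradient_accumulation_steps * num_processes
print(effective_batch_size)  # 12
```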
From your training script, the problem might lie in the train_batch_size setting. The train_addsr.py script we provided only supports train_batch_size=2, so setting it to any other value results in an incorrect loss calculation and corrupted training. You can either adapt the training code to support train_batch_size=6 or simply set it to 2; the problem should then disappear. Sorry for the inconvenience.
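To illustrate why this can fail silently, here is a minimal pure-Python sketch (not the actual train_addsr.py loss, whose details are not shown here): if any part of the loss hard-codes a batch of 2, a larger batch can be quietly mishandled instead of raising an error, matching the "incorrect loss" symptom:

```python
# Hypothetical sketch: a per-sample weighting hard-coded for exactly
# 2 samples per batch, as an analogy for a batch-size-2 assumption.
WEIGHTS = [1.0, 0.5]  # assumes train_batch_size == 2

def weighted_loss(per_sample_losses):
    # zip() silently truncates when the batch is larger than 2,
    # so samples 3-6 are dropped from the loss without any error.
    weighted = [w * l for w, l in zip(WEIGHTS, per_sample_losses)]
    return sum(weighted) / len(weighted)

print(weighted_loss([0.2, 0.4]))                      # batch 2: correct
print(weighted_loss([0.2, 0.4, 1.0, 1.0, 1.0, 1.0]))  # batch 6: same value, 4 samples ignored
```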
I'm very sorry, but I tried setting train_batch_size=2 and the issue still persists. In each epoch, the model tends to predict the same content for all images, for example, every output being a face.
Can you show me the loss curves?
I retrained the network for 5k iterations, and it performed well. Based on your examples, I suspect there might be an issue with how your model is being loaded. Can you provide the detailed training log?
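One quick way to rule out a loading problem (a hypothetical sketch with made-up parameter names, not AddSR's real ones) is to diff the checkpoint's parameter names against the model's, since a partial or non-strict load can leave layers at random initialization without raising any error:

```python
# Hypothetical parameter names, for illustration only.
ckpt_keys = {"conv_in.weight", "conv_in.bias", "time_embed.0.weight"}
model_keys = {"conv_in.weight", "conv_in.bias", "time_embedding.0.weight"}

missing = model_keys - ckpt_keys      # left at random init after loading
unexpected = ckpt_keys - model_keys   # silently dropped by a non-strict load
print(sorted(missing))     # ['time_embedding.0.weight']
print(sorted(unexpected))  # ['time_embed.0.weight']
```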
Thank you very much. I found out it was the training data.
Hello, thank you very much for your excellent work.
During training, controlnet0, controlnet2, unet1, and unet3 are saved. In the testing phase, I loaded controlnet2 and unet3, but the resulting super-resolution images were completely different from the originals, with the image content totally out of control (the outputs might all be faces, or some kind of landscape). Could you please tell me what might be causing this?