Hello and thank you for the great work!
I have been interested in using DDS with Stable Diffusion 3, but applying the method directly does not yield satisfactory results. Since the latent space of Stable Diffusion 3 differs from that of 2.1 and 1.5, I adjusted the learning rate. I have tried many values, none of which produce good results:
Here's an image produced using the Stable Diffusion 3 scheduler with a learning rate of 0.005 (target prompt "a photo of a tiger", source prompt "a photo of a cat"):
and lr = 0.0075:
Both results are a far cry from what you have showcased using DDS. I also explored using the Stable Diffusion 2 scheduler instead of the one recommended for Stable Diffusion 3, which yields similar results (learning rates 0.01 and 0.015 respectively; images attached).
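For clarity, here is roughly where the learning rate enters: a simplified sketch of the update I am tuning rather than the exact code, with a placeholder standing in for the DDS gradient.

```python
# Simplified sketch of the latent update (not the exact attached code).
# Note the SD3 VAE produces 16-channel latents, unlike SD 1.5/2.1.
import torch

z_src = torch.randn(1, 16, 64, 64)          # encoded source image (placeholder)
z = z_src.clone().requires_grad_(True)      # latent being edited
optimizer = torch.optim.SGD([z], lr=0.005)  # the learning rate I am tuning

for _ in range(250):
    # In DDS the gradient is the difference of the two predictions:
    # pred(z_t, target_prompt) - pred(z_src_t, source_prompt).
    # A random tensor stands in for it here to keep the sketch self-contained.
    grad_dds = torch.randn_like(z)
    optimizer.zero_grad()
    z.backward(gradient=grad_dds)           # inject the DDS gradient directly
    optimizer.step()                        # z <- z - lr * grad_dds
```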
While testing, I found a configuration that works surprisingly well with the Stable Diffusion 3 scheduler. By setting the t_min and t_max variables to 20 and 400, adding weighting (in accordance with the scheduler), and running 250 iterations, I was able to generate this image:
However, an issue arises when the algorithm runs for longer, e.g. ~500 iterations: the image becomes overly saturated. This is particularly a problem when trying to change the season of an image (e.g. from summer to winter), where larger changes need longer training:
It would be great if you could share any insights into why this might happen and how it could be solved, so that DDS can be adapted for use with Stable Diffusion 3 (medium)!
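To make the question concrete, here is a condensed sketch of the change described above (not the full code, which follows below). The helper name is just for illustration, and the weighting shown is simply sigma(t) from the flow-matching schedule with what I believe is the SD3-medium shift of 3.0, so it may differ slightly from what the scheduler does exactly.

```python
# Condensed sketch of the configuration that worked: t_min=20, t_max=400,
# ~250 iterations, with a sigma(t)-based weight. SHIFT=3.0 is my reading of
# the SD3-medium flow-matching scheduler config.
import torch

T_MIN, T_MAX = 20, 400
NUM_TRAIN_TIMESTEPS = 1000
SHIFT = 3.0

def sample_t_and_weight(batch_size: int, device: torch.device):
    # Draw timesteps only from the restricted window that behaved well.
    t = torch.randint(T_MIN, T_MAX, (batch_size,), device=device)
    # Flow-matching sigma with the SD3 time shift, used as a per-sample weight
    # on the DDS gradient.
    sigma = t.float() / NUM_TRAIN_TIMESTEPS
    sigma = SHIFT * sigma / (1.0 + (SHIFT - 1.0) * sigma)
    return t, sigma
```

At each iteration the DDS gradient is scaled by the returned weight before the optimizer step.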
Below is the modified code that I used to create the images (configured for the successful tiger modification):