garibida / ReNoise-Inversion

Official Implementation for "ReNoise: Real Image Inversion Through Iterative Noising"
https://garibida.github.io/ReNoise-Inversion/
145 stars 5 forks

How do I find zT with only ddim inversion without renoise? #7

Closed (sangh0Kim closed 1 week ago)

sangh0Kim commented 2 weeks ago

Thank you for sharing the code! Such a great work!

I have a question. The ReNoise paper compares SDXL results obtained with ReNoise against results from DDIM inversion without ReNoise. Even if num_renoise_steps is set to 0, it is not pure DDIM inversion, right? I would like to compare the result of using only DDIM inversion in SDXL with the result of ReNoise, so could you please share the DDIM inversion code?

Thank you!

garibida commented 2 weeks ago

Hi,

Using the following configuration is equal to regular DDIM Inversion:

model_type = Model_Type.SDXL
scheduler_type = Scheduler_Type.DDIM
pipe_inversion, pipe_inference = get_pipes(model_type, scheduler_type, device=device)

config = RunConfig(model_type = model_type,
                    num_inference_steps = 50,
                    num_inversion_steps = 50,
                    num_renoise_steps = 0,
                    scheduler_type = scheduler_type,
                    perform_noise_correction = False,
                    noise_regularization_num_reg_steps = 0,
                    seed = 7865)

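The difference from the usual ReNoise setup is that num_renoise_steps is set to 0 and noise regularization is disabled (perform_noise_correction = False, noise_regularization_num_reg_steps = 0).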
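To make the role of num_renoise_steps concrete, here is a minimal toy sketch of a single inversion step on scalar values. It is not the repository's code: `ddim_inversion_step`, `invert_step`, `toy_eps`, and the `alpha_bar` values are hypothetical names and numbers chosen for illustration, the noise predictor is a stand-in for the UNet, and the averaging of estimates and the noise regularization described in the paper are left out. With zero renoise steps the loop body never runs and the step reduces to the standard DDIM inversion update.

```python
import math

def ddim_inversion_step(z_t, eps, a_t, a_next):
    """Deterministic DDIM update z_t -> z_{t+1} for a given noise estimate eps.

    a_t and a_next are the cumulative alpha products at the two timesteps.
    """
    # Clean latent implied by the current latent and the noise estimate.
    z0_pred = (z_t - math.sqrt(1.0 - a_t) * eps) / math.sqrt(a_t)
    # Re-noise that prediction to the next (noisier) timestep.
    return math.sqrt(a_next) * z0_pred + math.sqrt(1.0 - a_next) * eps

def invert_step(z_t, eps_fn, t, t_next, alpha_bar, num_renoise_steps=0):
    """One inversion step; eps_fn(z, t) is a hypothetical noise predictor."""
    a_t, a_next = alpha_bar[t], alpha_bar[t_next]
    # Plain DDIM inversion: estimate the noise at the current latent z_t.
    eps = eps_fn(z_t, t)
    z_next = ddim_inversion_step(z_t, eps, a_t, a_next)
    # ReNoise-style refinement (simplified): re-estimate the noise at the
    # current guess of z_{t+1} and redo the step from the same z_t.
    for _ in range(num_renoise_steps):
        eps = eps_fn(z_next, t_next)
        z_next = ddim_inversion_step(z_t, eps, a_t, a_next)
    return z_next

# Toy setup: made-up schedule values and a linear stand-in for the model.
alpha_bar = [0.99, 0.94, 0.88]
toy_eps = lambda z, t: 0.1 * z

plain = invert_step(1.0, toy_eps, 0, 1, alpha_bar)                      # plain DDIM
refined = invert_step(1.0, toy_eps, 0, 1, alpha_bar, num_renoise_steps=20)
print(plain, refined)
```

In this toy case the refined value settles at a point where the noise estimate is consistent with the latent it produces, which is the intuition behind the renoising iterations.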

sangh0Kim commented 2 weeks ago

Thank you so much for your quick reply! With your configuration I was able to check the plain DDIM inversion result, and it works for most images. However, there is one additional problem.

For all other images, DDIM inversion restores an image similar to the original, but it fails on images from the Cityscapes dataset. I tried to fix the problem without success. Do you have any idea what might cause this?

Below is the result of DDIM inversion on a Cityscapes image. The same problem occurs with the other Cityscapes images as well. This is the original Cityscapes image:

[image: edit_test]

And this is the DDIM inversion result:

[image: reconstructed image]

The inversion works well for other road-scene images. Below, the first image is the original and the second is the DDIM inversion result:

[image: edit_test3]

[image: reconstructed image]

Why does the problem occur only with Cityscapes images?

Thank you.

garibida commented 2 weeks ago

I'm glad my previous response helped! Did you use a specific prompt during the inversion process, or did you leave the prompt empty? Even minor changes in the prompt can significantly alter the DDIM inversion results, especially for complex images like cityscapes. We found DDIM inversion to be very sensitive to prompts. That is one of the problems that ReNoise addresses.

Even so, the result looks terrible, as if there is a bug. Out of curiosity, could you show me the result after one ReNoise step?

sangh0Kim commented 2 weeks ago

Once again, I admire your kindness. Thank you very much for your quick response.

When I previously performed DDIM inversion, I left the prompt blank, as you guessed. I entered a prompt for the Cityscapes image and found that DDIM inversion works well!

By the way, I found one cause of the inversion instability. The default is dtype = float16. If I change it to bfloat16, the Cityscapes image is ruined with both plain DDIM inversion and ReNoise, while other images still invert well under bfloat16. (All text prompts are blank in these tests.) With float16, the DDIM inversion result for the Cityscapes image is still wrong, but the result of one ReNoise step is good even without a text prompt.

Below is the result of one ReNoise step on the Cityscapes image with dtype=float16:

[image: cityscape float16 renoise1step]

It's working very well!

I think the problem was caused by my arbitrarily changing the dtype. Thank you for your response!
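For readers wondering how the two dtypes differ, the small standard-library sketch below rounds a value to float16 and to bfloat16. bfloat16 keeps the float32 exponent range but stores only 7 explicit mantissa bits, versus 10 for float16, so it resolves values near 1 more coarsely. Whether this coarser precision is what actually broke the Cityscapes inversions is an assumption, not something confirmed in this thread; the helper names `round_to_float16` and `round_to_bfloat16` are hypothetical.

```python
import struct

def round_to_float16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def round_to_bfloat16(x):
    """Round a Python float to the nearest bfloat16 value.

    bfloat16 is the upper 16 bits of a float32, so we round the float32
    bit pattern to nearest-even at bit 16 and clear the lower 16 bits.
    Intended for ordinary finite values only (no NaN/inf handling).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits = (bits + 0x7FFF + ((bits >> 16) & 1)) & 0xFFFFFFFF
    bits &= 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", bits))[0]

x = 1.001
print(round_to_float16(x))   # 1.0009765625 (spacing near 1 is 2**-10)
print(round_to_bfloat16(x))  # 1.0          (spacing near 1 is 2**-7)
```

Small per-step rounding differences of this kind can accumulate over many inversion steps, which is one possible reason a dtype change alters the result.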