Error introduced when using the p2p pipeline compared to null-text inversion

Hey, I've run into a weird issue: when I use p2p, the reconstruction of the original image (which null-text inversion alone reconstructs successfully) has errors:

My original image and its reconstruction from null-text inversion:

Outputs when using p2p:

Even when I use the original controller without any attention swap, the error exists as long as I have a new prompt:

It seems that adding a second prompt changes the context argument passed to ptp_utils.diffusion_step(model, controller, latents, context, t, guidance_scale, low_resource=False), which in turn changes the prediction noise_pred = model.unet(latents_input, t, encoder_hidden_states=context)["sample"] in ptp_utils.py.
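To make the question concrete, here is a minimal sketch of how I understand the classifier-free guidance step to batch the prompts. It is not the repo's code: toy_unet is a hypothetical stand-in for model.unet that treats every sample independently, guided_step is my own simplified name, and plain Python lists replace the torch tensors. Under that assumption the first (source) sample comes out identical whether or not a second prompt is in the batch, so any difference in the real pipeline would have to come from something that couples samples or from numerical differences between batch sizes.

```python
from typing import Callable, List

Vector = List[float]


def toy_unet(latent: Vector, context: Vector, t: int) -> Vector:
    """Hypothetical stand-in for model.unet: acts on one sample at a time."""
    return [0.9 * l + 0.1 * c + 0.01 * t for l, c in zip(latent, context)]


def guided_step(
    latents: List[Vector],
    uncond: List[Vector],
    cond: List[Vector],
    t: int,
    guidance_scale: float,
    unet: Callable[[Vector, Vector, int], Vector] = toy_unet,
) -> List[Vector]:
    """Simplified classifier-free guidance over a batch of prompts."""
    # Assumed to mirror the batching in diffusion_step:
    # latents are duplicated, context is [uncond..., cond...].
    latents_input = latents + latents
    context = uncond + cond
    preds = [unet(l, c, t) for l, c in zip(latents_input, context)]

    # Split the predictions back into unconditional and text halves.
    n = len(latents)
    pred_uncond, pred_text = preds[:n], preds[n:]

    # noise_pred = uncond + guidance_scale * (text - uncond), per sample.
    return [
        [u + guidance_scale * (x - u) for u, x in zip(pu, pt)]
        for pu, pt in zip(pred_uncond, pred_text)
    ]


z = [0.5, -1.0, 2.0]        # shared starting latent
null_emb = [0.0, 0.1, 0.2]  # made-up "null-text" embedding
src_emb = [1.0, 2.0, 3.0]   # made-up source prompt embedding
edit_emb = [3.0, 2.0, 1.0]  # made-up second prompt embedding

# Source prompt alone vs. source prompt plus a second prompt.
single = guided_step([z], [null_emb], [src_emb], t=10, guidance_scale=7.5)
pair = guided_step([z, z], [null_emb, null_emb], [src_emb, edit_emb],
                   t=10, guidance_scale=7.5)

print(single[0] == pair[0])
```

If the same comparison on the real pipeline (first sample of noise_pred with one prompt vs. two prompts, same latents and same t) shows a difference, that would narrow down where the reconstruction error enters.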

Does anybody know why this happens? Thanks! (Image credit: https://billf.mit.edu/about/shapetime)
