Open s9anus98a opened 5 months ago
It's because we expect latents to be a tensor: https://github.com/huggingface/diffusers/blob/f96e4a16adb4c31bab4c0a3d0d145ed2b086ecb0/src/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3.py#L635C9-L635C16
So the pipeline should output a latent tensor when output_type='latent' is set.
inv_latents is from here:
inv_latents = pipe(prompt="", negative_prompt="", guidance_scale=1.,
                   width=input_img.shape[-1], height=input_img.shape[-2],
                   output_type='latent', return_dict=False,
                   num_inference_steps=num_steps, latents=latents)
Hi, if you use output_type="latent", the pipeline returns a tensor.
Can you please post a reproducible code snippet with all the relevant parts of what you're doing? It's hard for us to help if you only provide parts of it.
Hi @asomoza @yiyixuxu, here's the full reproducible code:
SD3 DDIM Inversion https://colab.research.google.com/drive/1B0qGpwsEjpOm3xx_LzraHWYehTTgB8AL
I took a look at your code. Since you're using return_dict=False in the first generation, the result is a tuple, so for the second one you'll need to pass the latents like this:
image = pipe(prompt="", negative_prompt="", guidance_scale=1.,
             num_inference_steps=20, latents=inv_latents[0])
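To illustrate why the `[0]` is needed: with return_dict=False the call hands back a one-element tuple rather than the bare tensor. The sketch below is only a stand-in and does not load diffusers; `fake_pipe` and `FakeOutput` are hypothetical names that mimic just the return convention, and a nested list stands in for the latent tensor.

```python
from dataclasses import dataclass
from typing import Any, Tuple, Union


@dataclass
class FakeOutput:
    """Hypothetical stand-in for a pipeline output object with an `images` field."""
    images: Any


def fake_pipe(latents: Any, return_dict: bool = True) -> Union[FakeOutput, Tuple[Any]]:
    """Hypothetical stand-in that mimics only the return convention, not real denoising."""
    result = latents  # assumption: a real pipeline would run its denoising loop here
    if not return_dict:
        return (result,)  # a one-element tuple, not the bare value
    return FakeOutput(images=result)


latents = [[0.1, 0.2], [0.3, 0.4]]  # placeholder for a latent tensor

inv_latents = fake_pipe(latents, return_dict=False)
print(type(inv_latents).__name__)  # the container type, a tuple

unwrapped = inv_latents[0]  # index the tuple to get the latents back
print(unwrapped == latents)

# With return_dict left at its default, the value sits on the output object instead.
print(fake_pipe(latents).images == latents)
```

Passing `inv_latents` itself as `latents=` would hand the pipeline a tuple, which is what triggers the error when it expects a tensor.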
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
script to produce:
inv_latents value print: