huggingface / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch and FLAX.
https://huggingface.co/docs/diffusers
Apache License 2.0

A bug about one-step inference in PixArtAlphaPipeline #8689

Open Luo-Yihong opened 3 months ago

Luo-Yihong commented 3 months ago

Describe the bug

In the current implementation of PixArtAlphaPipeline, one-step inference is hard-wired to DMD, which is inappropriate. This breaks other one-step inference code built on PixArt-alpha.

Reproduction

`
import torch
from diffusers import PixArtAlphaPipeline, LCMScheduler, Transformer2DModel

transformer = Transformer2DModel.from_pretrained(
    "Luo-Yihong/yoso_pixart1024", torch_dtype=torch.float16
).to('cuda')

pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-512x512",
    transformer=transformer,
    torch_dtype=torch.float16,
    use_safetensors=True,
)
pipe = pipe.to('cuda')
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.scheduler.config.prediction_type = "v_prediction"

generator = torch.manual_seed(318)
imgs = pipe(
    prompt="Pirate ship trapped in a cosmic maelstrom nebula, rendered in cosmic beach whirlpool engine, volumetric lighting, spectacular, ambient lights, light pollution, cinematic atmosphere, art nouveau style, illustration art artwork by SenseiJaye, intricate detail.",
    num_inference_steps=1,
    num_images_per_prompt=1,
    generator=generator,
    guidance_scale=1.,
)[0]
imgs[0]
`

This code currently fails to run.

Logs

File D:\ComfyUI\venv\lib\site-packages\diffusers\pipelines\pixart_alpha\pipeline_pixart_alpha.py:942, in PixArtAlphaPipeline.__call__(self, prompt, negative_prompt, num_inference_steps, timesteps, sigmas, guidance_scale, num_images_per_prompt, height, width, eta, generator, latents, prompt_embeds, prompt_attention_mask, negative_prompt_embeds, negative_prompt_attention_mask, output_type, return_dict, callback, callback_steps, clean_caption, use_resolution_binning, max_sequence_length, **kwargs)
939 # compute previous image: x_t -> x_t-1
940 if num_inference_steps == 1:
941 # For DMD one step sampling: https://arxiv.org/abs/2311.18828
--> 942 latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs).pred_original_sample
943 else:
944 latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]

AttributeError: 'LCMSchedulerOutput' object has no attribute 'pred_original_sample'
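
The failure comes from the one-step branch assuming that every scheduler output exposes `pred_original_sample`. A minimal sketch of a more defensive selection is below. It uses plain stand-in dataclasses rather than the real diffusers classes, and `one_step_latents`, `OutputWithX0`, and `LCMLikeOutput` are hypothetical names for illustration, not part of the library or of any merged fix.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class OutputWithX0:
    """Stand-in for a scheduler output that exposes pred_original_sample."""
    prev_sample: Any
    pred_original_sample: Optional[Any] = None


@dataclass
class LCMLikeOutput:
    """Stand-in with the fields reported for LCMSchedulerOutput (no pred_original_sample)."""
    prev_sample: Any
    denoised: Optional[Any] = None


def one_step_latents(step_output: Any) -> Any:
    """Hypothetical helper: choose latents after a single step.

    Use pred_original_sample when the output provides it (the DMD-style path),
    otherwise fall back to prev_sample, which is what the multi-step branch uses.
    """
    pred = getattr(step_output, "pred_original_sample", None)
    return pred if pred is not None else step_output.prev_sample
```

With a check like this, an output that lacks the attribute falls through to `prev_sample` instead of raising `AttributeError`.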

System Info

The newest diffusers version.

Who can help?

@yiyixuxu @DN6

Luo-Yihong commented 3 months ago

The PixArtAlphaPipeline should not default to DMD for one-step inference, because doing so makes code that previously ran correctly raise an error.

For example, the demo usage of one-step inference with YOSO-PixArt currently fails to run.
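
Until the pipeline changes, one possible user-side workaround is to wrap the scheduler's `step` so that its object-style output also carries a `pred_original_sample` attribute. The sketch below is an assumption, not a tested fix against diffusers: `expose_pred_original_sample` and `FakeScheduler` are hypothetical names, the fake scheduler only imitates the shape of a `step` call, and reusing `denoised` as the predicted clean sample is an assumption about what the one-step branch expects.

```python
import functools
from types import SimpleNamespace


def expose_pred_original_sample(scheduler):
    """Hypothetical helper: patch scheduler.step to add pred_original_sample."""
    original_step = scheduler.step

    # functools.wraps keeps the original signature visible to inspect.signature,
    # which matters if the caller introspects step() for supported kwargs.
    @functools.wraps(original_step)
    def step(*args, **kwargs):
        out = original_step(*args, **kwargs)
        # Tuple outputs (return_dict=False) and outputs that already have the
        # attribute are passed through unchanged.
        if isinstance(out, tuple) or hasattr(out, "pred_original_sample"):
            return out
        # Assumption: 'denoised' plays the role of the predicted clean sample.
        x0 = getattr(out, "denoised", None)
        if x0 is None:
            x0 = out.prev_sample
        return SimpleNamespace(
            prev_sample=out.prev_sample, denoised=x0, pred_original_sample=x0
        )

    scheduler.step = step
    return scheduler


class FakeScheduler:
    """Stand-in scheduler used only to exercise the wrapper."""

    def step(self, model_output, timestep, sample, generator=None, return_dict=True):
        if not return_dict:
            return (sample,)
        return SimpleNamespace(prev_sample=sample, denoised="x0")


patched = expose_pred_original_sample(FakeScheduler())
```

In the reproduction above, the equivalent call would be `expose_pred_original_sample(pipe.scheduler)` after the `LCMScheduler` is assigned. Whether that restores the previous one-step behaviour has not been verified here.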

sayakpaul commented 3 months ago

Cc: @lawrence-cj

github-actions[bot] commented 2 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.