A bug about one-step inference in PixArtAlphaPipeline

Luo-Yihong commented 5 months ago

Describe the bug

When implementing the PixArtAlphaPipeline, one step of inference was bound to the DMD, which is inappropriate. This resulted in errors in other one-step inference codes based on Pixar-alpha.

Reproduction

`import torch from diffusers import PixArtAlphaPipeline, LCMScheduler, Transformer2DModel

transformer = Transformer2DModel.from_pretrained( "Luo-Yihong/yoso_pixart1024", torch_dtype=torch.float16).to('cuda')

pipe = PixArtAlphaPipeline.from_pretrained("PixArt-alpha/PixArt-XL-2-512x512", transformer=transformer, torch_dtype=torch.float16, use_safetensors=True)

pipe = pipe.to('cuda') pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config) pipe.scheduler.config.prediction_type = "v_prediction" generator = torch.manual_seed(318) imgs = pipe(prompt="Pirate ship trapped in a cosmic maelstrom nebula, rendered in cosmic beach whirlpool engine, volumetric lighting, spectacular, ambient lights, light pollution, cinematic atmosphere, art nouveau style, illustration art artwork by SenseiJaye, intricate detail.", num_inference_steps=1, num_images_per_prompt = 1, generator = generator, guidance_scale=1., )[0] imgs[0] ` The code is not able to run for now.

Logs

File D:\ComfyUI\venv\lib\site-packages\diffusers\pipelines\pixart_alpha\pipeline_pixart_alpha.py:942, in PixArtAlphaPipeline.call(self, prompt, negative_prompt, num_inference_steps, timesteps, sigmas, guidance_scale, num_images_per_prompt, height, width, eta, generator, latents, prompt_embeds, prompt_attention_mask, negative_prompt_embeds, negative_prompt_attention_mask, output_type, return_dict, callback, callback_steps, clean_caption, use_resolution_binning, max_sequence_length, **kwargs)
939 # compute previous image: x_t -> x_t-1
940 if num_inference_steps == 1:
941 # For DMD one step sampling: https://arxiv.org/abs/2311.18828
--> 942 latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs).pred_original_sample
943 else:
944 latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs, return_dict=False)[0]

AttributeError: 'LCMSchedulerOutput' object has no attribute 'pred_original_sample'

System Info

The newest diffuser.

Who can help?

@yiyixuxu @DN6

Luo-Yihong commented 5 months ago

The PixArtAlphaPipeline should not default to using DMD for one-step inference, as this will cause codes that were originally running normally to error out.

For example, the demo usage of one-step inference using YOSO-Pixart is unable to run for now.

sayakpaul commented 5 months ago

Cc: @lawrence-cj

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

a-r-r-o-w commented 1 week ago

Gentle ping @sayakpaul @lawrence-cj since this still seems to be unfixed/broken. If you have a fix @Luo-Yihong, would you like to help us with a PR?

sayakpaul commented 1 week ago

Is this fixed with the recent versions of diffusers? Could you try?

huggingface / diffusers