vladmandic / automatic

SD.Next: Advanced Implementation of Stable Diffusion and other Diffusion-based generative image models
https://github.com/vladmandic/automatic
GNU Affero General Public License v3.0

[Issue]: dev branch: after rendering Stable Video Diffusion in Image tab, you're stuck with SVD pipeline #3103

Closed Tillerz closed 6 months ago

Tillerz commented 6 months ago

Issue Description

After rendering Stable Video Diffusion in the Image tab, the pipeline gets stuck: you can select "None" in the Script dropdown box and click Process again, and it will still render a video.

If you go to the Text tab, no image gets rendered there anymore either, because it throws an error: StableVideoDiffusionPipeline.__call__() missing 1 required positional argument: 'image'. It looks like the old Script settings from the Image tab are carried over even after setting Script to "None".

It also doesn't matter if you manually set the "Diffusers pipeline" to "Autodetect" or "Stable Diffusion"; it stays stuck on the Stable Video Diffusion pipeline.

Even a plain server restart does not fix it automatically: the video model that was used now shows in the "Base Model" select box. Changing that back to an SD 1.5 model or similar restores the pipeline to where you want it. Apparently, switching to video via the Script dropdown box changes the model for you, but it doesn't update the "Base Model" select box accordingly, so you don't see that the model has changed.
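For reference, here is a minimal diffusers sketch (the model ID and image path are illustrative placeholders) of why the Text tab call then fails: unlike a text-to-image pipeline, StableVideoDiffusionPipeline takes an input image as its first positional argument, so invoking it with txt2img-style arguments produces exactly the TypeError shown in the log below.

```python
# Minimal sketch, assuming a standard diffusers install.
# The model ID and image path are illustrative placeholders.
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import load_image

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid",  # illustrative model ID
    torch_dtype=torch.float16,
).to("cuda")

image = load_image("input.png")  # the required positional argument

# Works: SVD is image-to-video, so an input image must be supplied.
frames = pipe(image, num_frames=14).frames

# Fails like in the log: a txt2img-style call passes no image.
# pipe(num_inference_steps=100)  # TypeError: __call__() missing ... 'image'
```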

Version Platform Description

Version 070ae614, WSL2, Chrome

Relevant log output

21:53:03-602205 WARNING  Pipeline class change failed: type=DiffusersTaskType.TEXT_2_IMAGE pipeline=StableVideoDiffusionPipeline AutoPipeline can't find a pipeline linked to
                         StableVideoDiffusionPipeline for None
21:53:03-604967 WARNING  Pipeline class change failed: type=DiffusersTaskType.TEXT_2_IMAGE pipeline=StableVideoDiffusionPipeline AutoPipeline can't find a pipeline linked to
                         StableVideoDiffusionPipeline for None
21:53:03-630008 INFO     Base: class=StableVideoDiffusionPipeline
21:53:03-632368 DEBUG    Sampler: sampler="DPM++ 2M" config={'num_train_timesteps': 1000, 'beta_start': 0.00085, 'beta_end': 0.012, 'beta_schedule': 'scaled_linear',
                         'prediction_type': 'v_prediction', 'thresholding': False, 'sample_max_value': 1.0, 'algorithm_type': 'sde-dpmsolver++', 'solver_type': 'midpoint',
                         'lower_order_final': False, 'use_karras_sigmas': True, 'final_sigmas_type': 'zero', 'timestep_spacing': 'leading'}
21:53:03-636403 DEBUG    Diffuser pipeline: StableVideoDiffusionPipeline task=DiffusersTaskType.IMAGE_2_IMAGE set={'generator': device(type='cuda'), 'num_inference_steps': 100,
                         'output_type': 'latent', 'parser': 'Fixed attention'}
21:53:03-654100 ERROR    Exception: StableVideoDiffusionPipeline.__call__() missing 1 required positional argument: 'image'
21:53:03-655043 ERROR    Arguments: args=('task(ulsibdqduxbdwzk)', 'fast car', '', [], 40, 4, 4, True, False, False, False, 1, 1, 6, 0, 0.7, 0, 0.5, 1, -1.0, -1.0, 0, 0, 0, 768,
                         512, True, 0.4, 2, 'ESRGAN 4x Remacri', True, 10, 0, 0, 10, 0, '', '', 0, 0, 0, 0, False, 4, 0.95, False, 0.6, 1, '#000000', 0, [], 0, 1, 'None', 'None',
                         'None', 'None', 0.5, 0.5, 0.5, 0.5, None, None, None, None, 0, 0, 0, 0, 1, 1, 1, 1, None, None, None, None, False, '', 'None', 16, 'None', 1, True,
                         'None', 2, True, 1, 0, True, 'none', 3, 4, 0.25, 0.25, False, False, {'ad_model': 'mediapipe_face_full', 'ad_model_classes': '', 'ad_prompt': '',
                         'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0, 'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset':
                         0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4, 'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True,
                         'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False, 'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False,
                         'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7, 'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False,
                         'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler': 'Default', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False,
                         'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip': 1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module':
                         'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0, 'ad_controlnet_guidance_end': 1, 'is_api': ()}, {'ad_model': 'None',
                         'ad_model_classes': '', 'ad_prompt': '', 'ad_negative_prompt': '', 'ad_confidence': 0.3, 'ad_mask_k_largest': 0, 'ad_mask_min_ratio': 0,
                         'ad_mask_max_ratio': 1, 'ad_x_offset': 0, 'ad_y_offset': 0, 'ad_dilate_erode': 4, 'ad_mask_merge_invert': 'None', 'ad_mask_blur': 4,
                         'ad_denoising_strength': 0.4, 'ad_inpaint_only_masked': True, 'ad_inpaint_only_masked_padding': 32, 'ad_use_inpaint_width_height': False,
                         'ad_inpaint_width': 512, 'ad_inpaint_height': 512, 'ad_use_steps': False, 'ad_steps': 28, 'ad_use_cfg_scale': False, 'ad_cfg_scale': 7,
                         'ad_use_checkpoint': False, 'ad_checkpoint': 'Use same checkpoint', 'ad_use_vae': False, 'ad_vae': 'Use same VAE', 'ad_use_sampler': False, 'ad_sampler':
                         'Default', 'ad_scheduler': 'Use same scheduler', 'ad_use_noise_multiplier': False, 'ad_noise_multiplier': 1, 'ad_use_clip_skip': False, 'ad_clip_skip':
                         1, 'ad_restore_face': False, 'ad_controlnet_model': 'None', 'ad_controlnet_module': 'None', 'ad_controlnet_weight': 1, 'ad_controlnet_guidance_start': 0,
                         'ad_controlnet_guidance_end': 1, 'is_api': ()}, False, False, 1, False, False, False, 1.1, 1.5, 100, 0.7, False, False, True, False, False, 0,
                         'Gustavosta/MagicPrompt-Stable-Diffusion', '', False, False, False, False, 'base', 3, 1, 1, 0.8, 8, 64, True, 1, 1, 0.5, 0.5, False, False, 'positive',
                         'comma', 0, False, False, '', 'None', '', 1, '', 'None', True, 0, 'None', 2, True, 1, 0, 0, '', [], 0, '', [], 0, '', [], False, True, False, False,
                         False, False, 0, 'None', [], 'FaceID Base', True, True, 1, 1, 1, 0.5, True, 'person', 1, 0.5, True) kwargs={}
21:53:03-660976 ERROR    gradio call: TypeError
╭────────────────────────────────────────────────────────────────────── Traceback (most recent call last) ───────────────────────────────────────────────────────────────────────╮
│ /home/tillerz/dev/modules/call_queue.py:31 in f                                                                                                                                │
│                                                                                                                                                                                │
│   30 │   │   │   try:                                                                                                                                                          │
│ ❱ 31 │   │   │   │   res = func(*args, **kwargs)                                                                                                                               │
│   32 │   │   │   │   progress.record_results(id_task, res)                                                                                                                     │
│                                                                                                                                                                                │
│ /home/tillerz/dev/modules/txt2img.py:89 in txt2img                                                                                                                             │
│                                                                                                                                                                                │
│   88 │   if processed is None:                                                                                                                                                 │
│ ❱ 89 │   │   processed = processing.process_images(p)                                                                                                                          │
│   90 │   p.close()                                                                                                                                                             │
│                                                                                                                                                                                │
│ /home/tillerz/dev/modules/processing.py:189 in process_images                                                                                                                  │
│                                                                                                                                                                                │
│   188 │   │   │   with context_hypertile_vae(p), context_hypertile_unet(p):                                                                                                    │
│ ❱ 189 │   │   │   │   processed = process_images_inner(p)                                                                                                                      │
│   190                                                                                                                                                                          │
│                                                                                                                                                                                │
│ /home/tillerz/dev/modules/processing.py:307 in process_images_inner                                                                                                            │
│                                                                                                                                                                                │
│   306 │   │   │   │   │   from modules.processing_diffusers import process_diffusers                                                                                           │
│ ❱ 307 │   │   │   │   │   x_samples_ddim = process_diffusers(p)                                                                                                                │
│   308 │   │   │   │   else:                                                                                                                                                    │
│                                                                                                                                                                                │
│ /home/tillerz/dev/modules/processing_diffusers.py:472 in process_diffusers                                                                                                     │
│                                                                                                                                                                                │
│   471 │   │   apply_hidiffusion()                                                                                                                                              │
│ ❱ 472 │   │   output = shared.sd_model(**base_args) # pylint: disable=not-callable                                                                                             │
│   473 │   │   if isinstance(output, dict):                                                                                                                                     │
│                                                                                                                                                                                │
│ /home/tillerz/dev/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py:115 in decorate_context                                                                         │
│                                                                                                                                                                                │
│   114 │   │   with ctx_factory():                                                                                                                                              │
│ ❱ 115 │   │   │   return func(*args, **kwargs)                                                                                                                                 │
│   116                                                                                                                                                                          │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
TypeError: StableVideoDiffusionPipeline.__call__() missing 1 required positional argument: 'image'

Backend

Diffusers

Branch

Dev

Model

SD 1.5


vladmandic commented 6 months ago

SVD is a large standalone model that is loaded like any other model - it doesn't use a separate placeholder, purely because of its size.

I cannot set the standard loaded model aside and restore the pipeline once SVD is done, as that would be extremely slow. Also, how would I know ahead of time whether the next run is going to use SVD again or not? Imagine if I restored the pipeline on SVD completion, but then you wanted to run another SVD generation - I'd have to load it all over again.

So once you're done with SVD and want to go back to SD15 or SDXL or whatever, you need to manually load the model you want to use.
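To make the cost concrete, a hedged sketch of what that reload amounts to in plain diffusers terms (the checkpoint path is an illustrative placeholder, not SD.Next's internal code): going back means a full checkpoint load that replaces the SVD pipeline in memory, which is why it is not done automatically.

```python
# Hedged sketch, not SD.Next's actual code; the path is a placeholder.
import torch
from diffusers import StableDiffusionPipeline

# Returning to SD 1.5 means fully re-reading the checkpoint from disk,
# which replaces the StableVideoDiffusionPipeline currently loaded.
pipe = StableDiffusionPipeline.from_single_file(
    "/models/Stable-diffusion/sd15-checkpoint.safetensors",
    torch_dtype=torch.float16,
).to("cuda")
```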

I'm not sure how to handle this better - if there is an actual proposal, let me know. In either case, that would be a feature request, not an issue.

Tillerz commented 6 months ago

Ok, totally understood. The problem is that the video model doesn't automatically show up in the model dropdown box. But knowing that it changed (and just didn't update the UI) is good enough for a manual workaround: I guess just selecting a different SD 1.5 model and then switching back to the one I usually use should fix it without a full restart. :)

vladmandic commented 6 months ago

unfortunately it's not really possible to update the model UI element from that code - Gradio just doesn't see it as updated.
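For context, the constraint is that a Gradio component only changes when an event handler lists it as an output and returns a new value for it; code running deep inside the generation pipeline has no such handle on the dropdown. A minimal sketch (component names are illustrative, not SD.Next's actual UI code):

```python
# Minimal sketch of the Gradio constraint; names are illustrative.
import gradio as gr

with gr.Blocks() as demo:
    model_dd = gr.Dropdown(["sd15", "svd"], value="sd15", label="Base Model")
    status = gr.Textbox(label="Status")
    run = gr.Button("Generate")

    def generate():
        # The generation code may swap the model internally, but this handler
        # cannot push a new value to model_dd: only components declared in
        # `outputs` below can be updated by its return value.
        return "done"

    run.click(fn=generate, inputs=None, outputs=status)  # model_dd not an output

demo.launch()
```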

Tillerz commented 6 months ago

Not fixable due to Gradio.