AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: RuntimeError: Input type (float) and bias type (struct c10::Half) using fixed FP16 SDXL VAE with hires fix #12205

Closed: freecoderwaifu closed this issue 1 year ago

freecoderwaifu commented 1 year ago

What happened?

Using hires fix with the fixed FP16 SDXL VAE causes RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same. That is perhaps understandable, since it mixes precisions.

https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/tree/main

However, switching the VAE to Automatic, getting a NaN error, letting it fix itself via "Automatically revert VAE to 32-bit floats", and then selecting the fixed FP16 VAE again lets it work as expected.

This would make sense if everything is switched to FP32 after the NaN fix (unsure if that's the case), but it may not be the desired behavior, especially since the fixed VAE should ideally work with hires fix right away.
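The failure mode can be reproduced outside webui. Below is a minimal sketch (not webui code, just an assumption about the mechanism): a float32 tensor fed to an FP16 conv layer raises the same RuntimeError, while casting the module up to float32, which is roughly what "Automatically revert VAE to 32-bit floats" amounts to, avoids it.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 8, kernel_size=3).half()  # FP16 weights/bias, like the fixed VAE
x = torch.randn(1, 3, 16, 16)                 # float32 input, like the hires-fix samples

try:
    conv(x)  # dtype mismatch between input and weights/bias
    mismatch = False
except RuntimeError as e:
    mismatch = True
    print(e)  # an "Input type ... and ... type ... should be the same" style message

# Casting the module up to FP32 (the auto-revert workaround) makes dtypes agree:
y = conv.float()(x)
print(mismatch, y.dtype)
```

PyTorch convolutions do not type-promote, so any input/parameter dtype disagreement fails up front rather than computing in mixed precision.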

Steps to reproduce the problem

  1. Use hires fix with the fixed FP16 SDXL VAE
  2. Get the RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same error
  3. Set VAE to Automatic
  4. Get a NaN error, and let it sort itself out via Automatically revert VAE to 32-bit floats
  5. Load the fixed FP16 SDXL VAE again
  6. Works as intended

What should have happened?

FP16 SDXL VAE works right away with hires fix.
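The expected behavior presumably amounts to casting the samples to the VAE's parameter dtype before re-encoding them for hires fix. A sketch of that idea, where DummyEncoder is a hypothetical stand-in and not webui's actual VAE or API:

```python
import torch
import torch.nn as nn

class DummyEncoder(nn.Module):
    """Stand-in for a VAE encoder; the real one is the SDXL first-stage model."""
    def __init__(self):
        super().__init__()
        self.scale = nn.Parameter(torch.tensor(0.5))

    def forward(self, x):
        return x * self.scale

enc = DummyEncoder().half()        # FP16 module, like the fixed SDXL VAE
samples = torch.randn(1, 3, 8, 8)  # float32 samples coming out of hires upscaling

# Match the input dtype to the model's parameter dtype before encoding:
samples = samples.to(next(enc.parameters()).dtype)
z = enc(samples)
print(z.dtype)  # torch.float16
```

With the input cast down to FP16, the encoder never sees mixed dtypes, so the fixed FP16 VAE could be used directly without the FP32 revert.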

Version or Commit where the problem happens

a1eb496

What Python version are you running on ?

Python 3.10.x

What platforms do you use to access the UI ?

Windows

What device are you running WebUI on?

Nvidia GPUs (RTX 20 series and above)

Cross attention optimization

xformers

What browsers do you use to access the UI ?

Mozilla Firefox

Command Line Arguments

--xformers --no-download-sd-model --no-hashing --ad-no-huggingface

List of extensions

a

Console logs

Traceback (most recent call last):
      File "E:\variousAI\stable-diffusion-webui\modules\call_queue.py", line 58, in f
        res = list(func(*args, **kwargs))
      File "E:\variousAI\stable-diffusion-webui\modules\call_queue.py", line 37, in f
        res = func(*args, **kwargs)
      File "E:\variousAI\stable-diffusion-webui\modules\txt2img.py", line 62, in txt2img
        processed = processing.process_images(p)
      File "E:\variousAI\stable-diffusion-webui\modules\processing.py", line 677, in process_images
        res = process_images_inner(p)
      File "E:\variousAI\stable-diffusion-webui\extensions\sd-webui-controlnet\scripts\batch_hijack.py", line 42, in processing_process_images_hijack
        return getattr(processing, '__controlnet_original_process_images_inner')(p, *args, **kwargs)
      File "E:\variousAI\stable-diffusion-webui\modules\processing.py", line 794, in process_images_inner
        samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
      File "E:\variousAI\stable-diffusion-webui\modules\processing.py", line 1109, in sample
        samples = self.sd_model.get_first_stage_encoding(self.sd_model.encode_first_stage(decoded_samples))
      File "E:\variousAI\stable-diffusion-webui\venv\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
        return func(*args, **kwargs)
      File "E:\variousAI\stable-diffusion-webui\repositories\generative-models\sgm\models\diffusion.py", line 127, in encode_first_stage
        z = self.first_stage_model.encode(x)
      File "E:\variousAI\stable-diffusion-webui\repositories\generative-models\sgm\models\autoencoder.py", line 321, in encode
        return super().encode(x).sample()
      File "E:\variousAI\stable-diffusion-webui\repositories\generative-models\sgm\models\autoencoder.py", line 308, in encode
        h = self.encoder(x)
      File "E:\variousAI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "E:\variousAI\stable-diffusion-webui\repositories\generative-models\sgm\modules\diffusionmodules\model.py", line 576, in forward
        hs = [self.conv_in(x)]
      File "E:\variousAI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
        return forward_call(*args, **kwargs)
      File "E:\variousAI\stable-diffusion-webui\extensions-builtin\Lora\networks.py", line 376, in network_Conv2d_forward
        return torch.nn.Conv2d_forward_before_network(self, input)
      File "E:\variousAI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
        return self._conv_forward(input, self.weight, self.bias)
      File "E:\variousAI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
        return F.conv2d(input, weight, bias, self.stride,
    RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same

Additional information

No response

GuruVirus commented 1 year ago

It is independent of user config: I get the error with a 4090 and --opt-sdp-no-mem-attention. The auto-workaround also works for me.

djdookie commented 1 year ago

I also get this error when using SDXL with hires fix with any upscaler; only the latent upscaler works.

dwcarr commented 1 year ago

This started spontaneously for me: after doing dozens of high-res fixes using UltraSharp on SDXL with no problem, I suddenly started getting this error. The VAE was always set to Automatic. Running on a 4080, Windows 10. I now get this error every time, even on things that used to work with the exact same settings.

I am able to manually upres in extras, and then send to img2img for refinement without any errors.

jonathanschoeller commented 1 year ago

For me it makes no difference whether I'm using sdxl-vae-fp16-fix/tree/main or not, or whether I have enabled "Automatically revert VAE to 32-bit floats" or not.

Hi-res fix with 4x-UltraSharp and other non-latent upscalers, in combination with several SDXL models including sdXL_v10VAEFix and sdvn6Realxl_detailface, gives me this same error.

catboxanon commented 1 year ago

Can one of you having this issue confirm whether or not it's still an issue as of 1.6.0-RC or the dev branch? There were a couple of fixes made involving VAEs recently and that may have resolved this. If it's still not resolved, please consider sharing the metadata for what you are trying to generate and I or somebody else can try looking more into this.

thulle commented 1 year ago

@catboxanon I ran a few tests with R-ESRGAN 4x+ and SwinIR_4x; it seems to work as intended now. Thanks a lot!

jonathanschoeller commented 1 year ago

@catboxanon This has resolved the issue for me. Thanks!

git checkout tags/v1.6.0-RC

Possibly relevant config.json:

{
    "sd_model_checkpoint": "nightvisionXLPhotorealisticPortrait_beta0681Bakedvae.safetensors [2f602b1df5]",
    "sd_vae": "Automatic",
    "sd_vae_as_default": true,
    "auto_vae_precision": true
}
Steps: 40, NGMS: 0.5, Size: 896x1152, Sampler: DPM++ 2M Karras, Hires steps: 10, Hires upscale: 1.5, Hires upscaler: 4x-UltraSharp, Denoising strength: 0.4, Token merging ratio: 0.1, Token merging ratio hr: 0.1

catboxanon commented 1 year ago

Does it work even with the fp16 SDXL VAE and the revert VAE option disabled? That's moreso what I'm curious about since that's what OP originally reported.

thulle commented 1 year ago

@catboxanon I got the idea to update all extensions and it blew up my install, but I can confirm that the VAE fixes work.

Did a clean checkout from github, unchecked "Automatically revert VAE to 32-bit floats", using VAE: sdxl_vae_fp16_fix.safetensors, upscaling with Hires upscale: 2, Hires upscaler: R-ESRGAN 4x+

footer shown as version: v1.6.0-RC  •  python: 3.11.4  •  torch: 2.0.1+cu118  •  xformers: 0.0.20  •  gradio: 3.41.0

catboxanon commented 1 year ago

Since reports here indicate this was fixed, I'm gonna go ahead and close this, but feel free to re-open or open a new issue if the same issue still occurs as of 1.6.0-RC with the same setup as OP.

freecoderwaifu commented 12 months ago

I sort of forgot about this but yes, can also confirm it is fixed, thank you 👍.

dancemanUK commented 9 months ago

RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same. Time taken: 15.9 sec.

version: v1.6.0-400-gf0f100e6  •  python: 3.10.11  •  torch: 2.0.1+cu118  •  xformers: 0.0.21  •  gradio: 3.41.2  •  checkpoint: 0364819244

sebaxakerhtc commented 6 months ago

Still happens with Tiled Diffusion