AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI

[Bug]: deepdanbooru interrogation doesn't work with `--precision full` #11119

Closed. Ryato003 closed this issue 1 year ago.

Ryato003 commented 1 year ago

Is there an existing issue for this?

I have searched the existing issues and checked the recent builds/commits.

What happened?

When I try to generate a prompt from an image, the process stops partway through and gives me a prompt cut in half, or even no output at all. This happens with DeepBooru (--deepdanbooru is already enabled):

Traceback (most recent call last):
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 422, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1051, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\AI\stable-diffusion-webui\modules\ui.py", line 1007, in <lambda>
    fn=lambda *args: process_interrogate(interrogate_deepbooru, *args),
  File "C:\AI\stable-diffusion-webui\modules\ui.py", line 137, in process_interrogate
    return [interrogation_function(ii_singles[mode]), None]
  File "C:\AI\stable-diffusion-webui\modules\ui.py", line 164, in interrogate_deepbooru
    prompt = deepbooru.model.tag(image)
  File "C:\AI\stable-diffusion-webui\modules\deepbooru.py", line 44, in tag
    res = self.tag_multi(pil_image)
  File "C:\AI\stable-diffusion-webui\modules\deepbooru.py", line 61, in tag_multi
    y = self.model(x)[0].detach().cpu().numpy()
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\AI\stable-diffusion-webui\modules\deepbooru_model.py", line 201, in forward
    t_360 = self.n_Conv_0(t_359_padded.to(self.n_Conv_0.bias.dtype) if devices.unet_needs_upcast else t_359_padded)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\AI\stable-diffusion-webui\extensions-builtin\Lora\lora.py", line 415, in lora_Conv2d_forward
    return torch.nn.Conv2d_forward_before_lora(self, input)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same
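The final error is a plain dtype mismatch: the DeepBooru model's weights were loaded in half precision, but the input image tensor arrives as float32. A minimal sketch of the same failure in standalone PyTorch (assuming a CUDA device; the names here are illustrative, not webui code):

import torch

# fp16 conv layer, standing in for the half-precision interrogator model
conv = torch.nn.Conv2d(3, 8, kernel_size=3).half().cuda()
# fp32 input, standing in for the image tensor under --precision full
x = torch.randn(1, 3, 64, 64, device="cuda")

try:
    conv(x)  # raises a dtype-mismatch RuntimeError like the one above
except RuntimeError as e:
    print(e)

y = conv(x.half())  # casting the input to the layer's dtype succeeds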

This also happened with CLIP:

Traceback (most recent call last):
  File "C:\AI\stable-diffusion-webui\modules\interrogate.py", line 206, in interrogate
    image_features = self.clip_model.encode_image(clip_image).type(self.dtype)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 341, in encode_image
    return self.visual(image.type(self.dtype))
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 229, in forward
    x = self.ln_pre(x)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 162, in forward
    ret = super().forward(x.type(torch.float32))
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\normalization.py", line 190, in forward
    return F.layer_norm(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\functional.py", line 2515, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Float but found Half
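Here the mismatch runs the other way: clip's LayerNorm subclass upcasts its input to float32 (the x.type(torch.float32) call in the traceback), so when the layer's own weights are half precision, torch.layer_norm refuses the mix. A minimal sketch of the same failure (illustrative names, not webui code):

import torch

ln = torch.nn.LayerNorm(768).half()  # fp16 weight and bias
x = torch.randn(1, 77, 768)          # fp32 input, mirroring x.type(torch.float32)

try:
    ln(x)  # raises a dtype-mismatch RuntimeError like the one above
except RuntimeError as e:
    print(e)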

Steps to reproduce the problem

  1. Add --deepdanbooru to COMMANDLINE_ARGS in the .bat file
  2. Run the .bat file
  3. Go to img2img and load an image
  4. Select 'Interrogate CLIP' and watch the CMD window
  5. Select 'Interrogate DeepBooru' and watch the CMD window

What should have happened?

I expected the generated prompt to appear in the appropriate output field. Instead, a small red "error" icon appears and an error message is printed in the CMD window.

Commit where the problem happens

v1.3.2 (baf6946e06249c5af9851c60171692c44ef633e0)

What Python version are you running on ?

Python 3.10.x

What platforms do you use to access the UI ?

Windows

What device are you running WebUI on?

Nvidia GPUs (RTX 20 above)

What browsers do you use to access the UI ?

Google Chrome

Command Line Arguments

set COMMANDLINE_ARGS= --xformers --autolaunch --precision full --deepdanbooru

List of extensions

None

Console logs

Already up to date.
venv "C:\AI\stable-diffusion-webui\venv\Scripts\Python.exe"
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug  1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: v1.3.2
Commit hash: baf6946e06249c5af9851c60171692c44ef633e0
Installing requirements
Launching Web UI with arguments: --xformers --autolaunch --precision full --deepdanbooru
Loading weights [6ce0161689] from C:\AI\stable-diffusion-webui\models\Stable-diffusion\v1-5-pruned-emaonly.safetensors
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.
Creating model from config: C:\AI\stable-diffusion-webui\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Startup time: 10.6s (import torch: 3.0s, import gradio: 2.2s, import ldm: 0.8s, other imports: 2.2s, load scripts: 1.2s, create ui: 0.5s, gradio launch: 0.5s).
Applying optimization: xformers... done.
Textual inversion embeddings loaded(0):
Model loaded in 8.8s (load weights from disk: 0.8s, create model: 0.5s, apply weights to model: 5.5s, apply half(): 0.5s, move model to device: 0.5s, load textual inversion embeddings: 1.0s).
load checkpoint from C:\AI\stable-diffusion-webui\models\BLIP\model_base_caption_capfilt_large.pth
Error interrogating
Traceback (most recent call last):
  File "C:\AI\stable-diffusion-webui\modules\interrogate.py", line 206, in interrogate
    image_features = self.clip_model.encode_image(clip_image).type(self.dtype)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 341, in encode_image
    return self.visual(image.type(self.dtype))
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 229, in forward
    x = self.ln_pre(x)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\clip\model.py", line 162, in forward
    ret = super().forward(x.type(torch.float32))
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\normalization.py", line 190, in forward
    return F.layer_norm(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\functional.py", line 2515, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Float but found Half

Traceback (most recent call last):
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\routes.py", line 422, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1323, in process_api
    result = await self.call_function(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\gradio\blocks.py", line 1051, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\AI\stable-diffusion-webui\modules\ui.py", line 1007, in <lambda>
    fn=lambda *args: process_interrogate(interrogate_deepbooru, *args),
  File "C:\AI\stable-diffusion-webui\modules\ui.py", line 137, in process_interrogate
    return [interrogation_function(ii_singles[mode]), None]
  File "C:\AI\stable-diffusion-webui\modules\ui.py", line 164, in interrogate_deepbooru
    prompt = deepbooru.model.tag(image)
  File "C:\AI\stable-diffusion-webui\modules\deepbooru.py", line 44, in tag
    res = self.tag_multi(pil_image)
  File "C:\AI\stable-diffusion-webui\modules\deepbooru.py", line 61, in tag_multi
    y = self.model(x)[0].detach().cpu().numpy()
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\AI\stable-diffusion-webui\modules\deepbooru_model.py", line 201, in forward
    t_360 = self.n_Conv_0(t_359_padded.to(self.n_Conv_0.bias.dtype) if devices.unet_needs_upcast else t_359_padded)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\AI\stable-diffusion-webui\extensions-builtin\Lora\lora.py", line 415, in lora_Conv2d_forward
    return torch.nn.Conv2d_forward_before_lora(self, input)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 463, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "C:\AI\stable-diffusion-webui\venv\lib\site-packages\torch\nn\modules\conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (float) and bias type (struct c10::Half) should be the same
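The deepbooru_model.py line in the traceback hints at the cause: the input is only cast to the conv layer's dtype when devices.unet_needs_upcast is set, which apparently does not hold under --precision full, so the fp32 tensor reaches the fp16 weights. A hypothetical workaround sketch for tag_multi() in modules/deepbooru.py (an assumption on my part, not the project's actual fix) would match the input to the model's parameter dtype unconditionally:

# hypothetical patch sketch for modules/deepbooru.py, tag_multi():
# cast the input to whatever precision the model was actually loaded in
param_dtype = next(self.model.parameters()).dtype
y = self.model(x.to(param_dtype))[0].detach().cpu().numpy()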

Additional information

No response

akx commented 1 year ago

Can you try without --precision full?

Ryato003 commented 1 year ago

Huh, without --precision full it works. Thanks!
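For reference, the working COMMANDLINE_ARGS line is simply the original one minus that flag:

set COMMANDLINE_ARGS= --xformers --autolaunch --deepdanbooru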

akx commented 1 year ago

Great!