AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: Whenever I try to use this together with ControlNet, it fails with ValueError: cannot determine region size; use 4-item box #8654

Closed Arxchibo closed 1 year ago

Arxchibo commented 1 year ago

Is there an existing issue for this?

What happened?

Whenever I try to use this together with ControlNet, it fails with ValueError: cannot determine region size; use 4-item box

Steps to reproduce the problem

  1. Go to ....
  2. Press ....
  3. ...

What should have happened?

It should run normally.

Commit where the problem happens

Run

What platforms do you use to access the UI ?

No response

What browsers do you use to access the UI ?

No response

Command Line Arguments

ChatGPT request:
I want you to act as a prompt generator. Compose each answer as a visual sentence. Do not write explanations on replies. Format the answers as javascript json arrays with a single string per answer. Return exactly 4 to my question. Answer the questions exactly. Answer the following question:
Take the prompt "a house standing alone surrounded by fog, silenthill fog city, ethereal photography, azahahadid spiral wood house in a rocky landscape, archdaily mir_render Brick_visual" and improve it.

Prompts:
Solitary dwelling shrouded by impenetrable mist
The eerie town of Silent Hill adrift in ethereal fog
A spiraling wooden house by Zaha Hadid set amongst craggy terrain
Brick visuals of Mir Render's architectural feats

Creating 4 image permutations
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:13<00:00,  1.50it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.67it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.69it/s]
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:03<00:00,  5.71it/s]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 80/80 [00:18<00:00,  4.32it/s]
ChatGPT request:
I want you to act as a prompt generator. Compose each answer as a visual sentence. Do not write explanations on replies. Format the answers as javascript json arrays with a single string per answer. Return exactly 4 to my question. Answer the questions exactly. Answer the following question:
Take the prompt "a house standing alone surrounded by fog, silenthill fog city, ethereal photography, azahahadid spiral wood house in a rocky landscape, archdaily mir_render Brick_visual" and improve it.

Prompts:
Lonely house in the midst of thick fog
Eerie photographs capturing the etherealness of silenthill fog city
Spiral wood house surrounded by rock formations, inspired by the creativity of Azahahadid
Brick house in stunning visuals by mir_render and featured on Archdaily

Creating 4 image permutations
Loading preprocessor: depth, model: control_depth-fp16 [400750f6]
Loaded state_dict from [F:\SDlocal\stable-diffusion-webui-master\extensions\sd-webui-controlnet\models\control_depth-fp16.safetensors]
ControlNet model control_depth-fp16 [400750f6] loaded.
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:05<00:00,  3.78it/s]
Error completing request█████████████▌                                                 | 20/80 [00:04<00:14,  4.17it/s]
Arguments: ('task(64w84leo67tt0iq)', 'a house standing alone surrounded by fog, silenthill fog city, ethereal photography, azahahadid spiral wood house in a rocky landscape, archdaily mir_render Brick_visual', '', [], 20, 0, False, False, 1, 1, 7, -1.0, -1.0, 0, 0, 0, False, 512, 512, False, 0.7, 2, 'Latent', 0, 0, 0, [], 4, False, False, 'LoRA', 'None', 1, 1, 'LoRA', 'None', 1, 1, 'LoRA', 'None', 1, 1, 'LoRA', 'None', 1, 1, 'LoRA', 'None', 1, 1, 'Refresh models', True, 'depth', 'control_depth-fp16 [400750f6]', 1, {'image': array([[[226, 240, 253],
        [226, 240, 253],
        [226, 240, 253],
        ...,
        [226, 240, 253],
        [226, 240, 253],
        [226, 240, 253]],

       [[226, 240, 253],
        [226, 240, 253],
        [226, 240, 253],
        ...,
        [226, 240, 253],
        [226, 240, 253],
        [226, 240, 253]],

       [[226, 240, 253],
        [226, 240, 253],
        [226, 240, 253],
        ...,
        [226, 240, 253],
        [226, 240, 253],
        [226, 240, 253]],

       ...,

       [[253, 255, 245],
        [254, 255, 246],
        [253, 255, 247],
        ...,
        [255, 255, 251],
        [255, 255, 251],
        [255, 255, 251]],

       [[252, 254, 249],
        [253, 255, 250],
        [254, 255, 253],
        ...,
        [254, 255, 250],
        [247, 248, 243],
        [247, 248, 242]],

       [[252, 254, 253],
        [254, 255, 255],
        [254, 255, 255],
        ...,
        [255, 255, 251],
        [255, 255, 251],
        [255, 255, 250]]], dtype=uint8), 'mask': array([[[  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        ...,
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255]],

       [[  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        ...,
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255]],

       [[  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        ...,
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255]],

       ...,

       [[  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        ...,
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255]],

       [[  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        ...,
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255]],

       [[  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        ...,
        [  0,   0,   0, 255],
        [  0,   0,   0, 255],
        [  0,   0,   0, 255]]], dtype=uint8)}, False, 'Scale to Fit (Inner Fit)', False, False, 384, 64, 64, 1, False, False, False, 'positive', 'comma', 0, False, False, '', 1, '', 0, '', 0, '', True, False, False, False, 0, 'Take the prompt {prompt} and improve it.', 2, 4.0, False, '', '', True, False, False, False) {}
Traceback (most recent call last):
  File "F:\SDlocal\stable-diffusion-webui-master\modules\call_queue.py", line 56, in f
    res = list(func(*args, **kwargs))
  File "F:\SDlocal\stable-diffusion-webui-master\modules\call_queue.py", line 37, in f
    res = func(*args, **kwargs)
  File "F:\SDlocal\stable-diffusion-webui-master\modules\txt2img.py", line 53, in txt2img
    processed = modules.scripts.scripts_txt2img.run(p, *args)
  File "F:\SDlocal\stable-diffusion-webui-master\modules\scripts.py", line 376, in run
    processed = script.run(p, *script_args)
  File "F:\SDlocal\stable-diffusion-webui-master\extensions\stable-diffusion-webui-chatgpt-utilities\scripts\prompt_chatgpt.py", line 182, in run
    temp_grid = images.image_grid(proc.images, p.batch_size)
  File "F:\SDlocal\stable-diffusion-webui-master\modules\images.py", line 52, in image_grid
    grid.paste(img, box=(i % params.cols * w, i // params.cols * h))
  File "F:\SDlocal\stable-diffusion-webui-master\venv\lib\site-packages\PIL\Image.py", line 1711, in paste
    raise ValueError(msg)
ValueError: cannot determine region size; use 4-item box

100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:05<00:00,  3.86it/s]
Total progress:  50%|█████████████████████████████████                                 | 40/80 [00:45<00:45,  1.13s/it]

List of extensions

controlnet

Console logs

PYTHON: can't open file 'F:\\SDlocal\\stable-diffusion-webui-master\\=“C:\\Users\\MSI\\AppData\\Local\\Programs\\Python\\Python310\\python.exe” set': [Errno 22] Invalid argument
venv "F:\SDlocal\stable-diffusion-webui-master\venv\Scripts\Python.exe"
Python 3.10.10 (tags/v3.10.10:aad5f6a, Feb  7 2023, 17:20:36) [MSC v.1929 64 bit (AMD64)]
Commit hash: <none>
Installing requirements for Web UI

#######################################################################################################
Initializing Dreambooth
If submitting an issue on github, please provide the below text for debugging purposes:

Python revision: 3.10.10 (tags/v3.10.10:aad5f6a, Feb  7 2023, 17:20:36) [MSC v.1929 64 bit (AMD64)]
Dreambooth revision: 43ae9d55531004f1dedaea7ac2443e9b16739913
SD-WebUI revision:

Checking Dreambooth requirements...
[+] bitsandbytes version 0.35.0 installed.
[+] diffusers version 0.10.2 installed.
[+] transformers version 4.25.1 installed.
[ ] xformers version N/A installed.
[+] torch version 1.13.1+cu117 installed.
[+] torchvision version 0.14.1+cu117 installed.

#######################################################################################################

Launching Web UI with arguments:
No module 'xformers'. Proceeding without it.
[AddNet] Updating model hashes...
0it [00:00, ?it/s]
[AddNet] Updating model hashes...
0it [00:00, ?it/s]
SD-Webui API layer loaded
[text2prompt] Following databases are available:
    all-mpnet-base-v2 : danbooru_strict
Loading weights [13249548d8] from F:\SDlocal\stable-diffusion-webui-master\models\Stable-diffusion\AARG_render-mix_sd-v.15_12000.ckpt
Creating model from config: F:\SDlocal\stable-diffusion-webui-master\configs\v1-inference.yaml
LatentDiffusion: Running in eps-prediction mode
DiffusionWrapper has 859.52 M params.
Applying cross attention optimization (Doggettx).
Textual inversion embeddings loaded(0):
Model loaded in 36.7s (load weights from disk: 23.7s, create model: 0.8s, apply weights to model: 1.4s, apply half(): 1.2s, move model to device: 1.0s, load textual inversion embeddings: 8.4s).
[text2prompt] Loading database with name "all-mpnet-base-v2 : danbooru_strict"...
[text2prompt] Database loaded
Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.

Additional information

No response

PestToast commented 1 year ago

You did not include any steps to reproduce the problem.

dtlnor commented 1 year ago

Not enough information.
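
For reference, the traceback in the report ends in `grid.paste(img, box=(i % params.cols * w, i // params.cols * h))`, which passes a 2-item box. With a 2-item box, Pillow's `Image.paste` takes the region size from the pasted image or the mask, and raises this `ValueError` when neither is a PIL image. The thread does not confirm why that happened here; one unverified possibility is that an entry in `proc.images` was no longer a PIL image once ControlNet was enabled. The sketch below only illustrates what the error message asks for: it computes explicit 4-item boxes using the same column and row arithmetic as the traceback line. `grid_boxes` and its parameters are hypothetical names, not part of the web UI or of Pillow.

```python
from typing import List, Tuple

# (left, upper, right, lower), the 4-item form the error message refers to
Box = Tuple[int, int, int, int]


def grid_boxes(count: int, cols: int, w: int, h: int) -> List[Box]:
    """Return an explicit 4-item box for each of `count` cells in a grid.

    Hypothetical helper for illustration only. `cols` is the number of
    columns; `w` and `h` are the width and height of one cell in pixels.
    """
    boxes: List[Box] = []
    for i in range(count):
        left = i % cols * w     # column offset, as in the traceback line
        upper = i // cols * h   # row offset, as in the traceback line
        boxes.append((left, upper, left + w, upper + h))
    return boxes


if __name__ == "__main__":
    # Four 512x512 images in a 2-column grid, matching the sizes in the log.
    for box in grid_boxes(count=4, cols=2, w=512, h=512):
        print(box)
```

An explicit box only removes the size lookup that raised the error. If the object being pasted is not a PIL image, it would probably still need converting (for example with `Image.fromarray` for a NumPy array) before `paste` can use it.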