d8ahazard / sd_smartprocess

Smart Pre-processing extension for Stable Diffusion

error when running CLIP #39

Omegadarling opened this issue 1 year ago (status: Open)

Omegadarling commented 1 year ago

I'm getting the following traceback when running CLIP captioning:

Traceback (most recent call last):
  File "C:\Automatic1111\extensions\sd_smartprocess\smartprocess.py", line 360, in preprocess
    full_caption = build_caption(img) if caption else None
  File "C:\Automatic1111\extensions\sd_smartprocess\smartprocess.py", line 159, in build_caption
    tags = clip_interrogator.interrogate(img, max_flavors=clip_max_flavors)
  File "C:\Automatic1111\extensions\sd_smartprocess\clipinterrogator.py", line 193, in interrogate
    caption = self.generate_caption(image)
  File "C:\Automatic1111\extensions\sd_smartprocess\clipinterrogator.py", line 174, in generate_caption
    caption = self.blip_model.generate(
  File "C:\Automatic1111\repositories\BLIP\models\blip.py", line 156, in generate
    outputs = self.text_decoder.generate(input_ids=input_ids,
  File "C:\Automatic1111\venv\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Automatic1111\venv\lib\site-packages\transformers\generation\utils.py", line 1490, in generate
    return self.beam_search(
  File "C:\Automatic1111\venv\lib\site-packages\transformers\generation\utils.py", line 2749, in beam_search
    outputs = self(
  File "C:\Automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Automatic1111\repositories\BLIP\models\med.py", line 886, in forward
    outputs = self.bert(
  File "C:\Automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Automatic1111\repositories\BLIP\models\med.py", line 781, in forward
    encoder_outputs = self.encoder(
  File "C:\Automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Automatic1111\repositories\BLIP\models\med.py", line 445, in forward
    layer_outputs = layer_module(
  File "C:\Automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Automatic1111\repositories\BLIP\models\med.py", line 361, in forward
    cross_attention_outputs = self.crossattention(
  File "C:\Automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Automatic1111\repositories\BLIP\models\med.py", line 277, in forward
    self_outputs = self.self(
  File "C:\Automatic1111\venv\lib\site-packages\torch\nn\modules\module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Automatic1111\repositories\BLIP\models\med.py", line 178, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: The size of tensor a (8) must match the size of tensor b (64) at non-singleton dimension 0
VanishX commented 1 year ago

Facing exactly the same error here, but I'm running Ubuntu, with SD-WebUI revision a9fed7c364061ae6efb37f797b6b522cb3cf7aa2, which is currently the latest one. The only unusual thing I did was add "fairscale" to requirements.txt, because it kept showing an error about there being no module named fairscale.

Omegadarling commented 1 year ago

@VanishX This same error came up with another extension (stable-diffusion-webui-dataset-tag-editor), and another user there suggested that the Dreambooth extension may be the problem:

"this may be a conflict with the new dreambooth extension which requires transformers~=4.27.1, which already conflicts with the clip interrogator (requires transformers~=4.26.1), which gives the same error as above."

VanishX commented 1 year ago

Thanks for the info. I'm using the latest Dreambooth extension; I'll look into it and try to figure out how to work around the conflicts.

hdvrai commented 1 year ago

Same issue as above

asgeorges commented 1 year ago

Hey! Has anyone figured this out yet?

@VanishX - not sure if you ever sorted out your issue, but for fairscale you should just be able to run

pip install fairscale

and then restart to resolve at least that error. There may be something funny going on with trying to route it through requirements.txt instead.
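
If you want to confirm the package actually landed in the same environment the webui runs from, a quick import check works. A minimal sketch, assuming the venv layout from the tracebacks above; run it with the venv's Python (e.g. /workspace/venv/bin/python):

# Verify fairscale is importable from the webui's own interpreter,
# not just from the system Python.
try:
    import fairscale
    print("fairscale", fairscale.__version__, "imported OK")
except ImportError as err:
    print("fairscale missing from this environment:", err)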

asgeorges commented 1 year ago

Getting the same error as @Omegadarling:

Traceback (most recent call last):
  File "/workspace/stable-diffusion-webui/extensions/sd_smartprocess/smartprocess.py", line 269, in preprocess
    short_caption = clip_interrogator.interrogate(img, short=True)
  File "/workspace/stable-diffusion-webui/extensions/sd_smartprocess/clipinterrogator.py", line 193, in interrogate
    caption = self.generate_caption(image)
  File "/workspace/stable-diffusion-webui/extensions/sd_smartprocess/clipinterrogator.py", line 174, in generate_caption
    caption = self.blip_model.generate(
  File "/workspace/stable-diffusion-webui/repositories/BLIP/models/blip.py", line 156, in generate
    outputs = self.text_decoder.generate(input_ids=input_ids,
  File "/workspace/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/workspace/venv/lib/python3.10/site-packages/transformers/generation/utils.py", line 1604, in generate
    return self.beam_search(
  File "/workspace/venv/lib/python3.10/site-packages/transformers/generation/utils.py", line 2902, in beam_search
    outputs = self(
  File "/workspace/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/stable-diffusion-webui/repositories/BLIP/models/med.py", line 886, in forward
    outputs = self.bert(
  File "/workspace/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/stable-diffusion-webui/repositories/BLIP/models/med.py", line 781, in forward
    encoder_outputs = self.encoder(
  File "/workspace/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/stable-diffusion-webui/repositories/BLIP/models/med.py", line 445, in forward
    layer_outputs = layer_module(
  File "/workspace/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/stable-diffusion-webui/repositories/BLIP/models/med.py", line 361, in forward
    cross_attention_outputs = self.crossattention(
  File "/workspace/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/stable-diffusion-webui/repositories/BLIP/models/med.py", line 277, in forward
    self_outputs = self.self(
  File "/workspace/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/stable-diffusion-webui/repositories/BLIP/models/med.py", line 178, in forward
    attention_scores = torch.matmul(query_layer, key_layer.transpose(-1, -2))
RuntimeError: The size of tensor a (8) must match the size of tensor b (64) at non-singleton dimension 0