Sygil-Dev / sygil-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[BUG] Optimized mode is broken #319

Closed DenkingOfficial closed 1 year ago

DenkingOfficial commented 1 year ago

Describe the bug: Optimized mode is not working. I'm getting an error during generation after downloading the latest version from the master branch.

To Reproduce Steps to reproduce the behavior:

  1. Add "--optimized" flag in relauncher.py
  2. Start webui
  3. Specify prompt and click "Generate"
  4. See error in logs
    0%|                                                                                           | 0/50 [00:00<?, ?it/s]
    !!Runtime error (txt2img)!!
    Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
    exiting...calling os._exit(0)
    Relauncher: Process is ending. Relaunching in 0.5s...


athu16 commented 1 year ago

It's happening after commit fe17340. I believe with optimized mode the model isn't getting transferred to the GPU at all now, hence the multiple-devices error.

DenkingOfficial commented 1 year ago

Thanks so much, changed the code myself and now it works like a charm! Anyway, waiting for a fix from @hlky.

hlky commented 1 year ago

@athu16 @DenkingOfficial This was added a couple of hours ago; it was supposed to be a fix for an OOM issue someone reported. I'll revert it, sorry. I can only go off what optimized users are saying, I don't have a 4GB card.

oobabooga commented 1 year ago

Works for me, I could not reproduce this bug... If model.cuda() is called, I get an OOM and the script doesn't even launch.

athu16 commented 1 year ago

@oobabooga are you using the optimised switch?

oobabooga commented 1 year ago

@athu16 yes

model.cuda() is also never called on the basujindal version: https://github.com/basujindal/stable-diffusion/blob/main/optimizedSD/optimized_txt2img.py
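For context, the low-VRAM strategy in the linked optimizedSD script can be sketched roughly like this (a pure-Python stand-in; the class and stage names are illustrative, not the actual code): sub-modules are moved to the GPU one at a time instead of calling model.cuda() on the whole model up front.

```python
# Hedged sketch of stage-wise device placement (names assumed, not
# copied from optimizedSD). Each stage is moved to the GPU only while
# it runs, then evicted, so a 4 GB card never holds the full model.

class Stage:
    """Stand-in for a model sub-module; tracks which device it is on."""
    def __init__(self, name):
        self.name = name
        self.device = "cpu"  # everything starts on the CPU

    def to(self, device):
        self.device = device
        return self

def run_pipeline(stages):
    history = []
    for stage in stages:
        stage.to("cuda:0")                     # move only the active stage
        history.append((stage.name, stage.device))
        stage.to("cpu")                        # evict before the next stage
    return history

stages = [Stage("text_encoder"), Stage("unet"), Stage("first_stage_vae")]
run_pipeline(stages)
```

This is why a single model.cuda() is never needed in that version: the full model is never resident on the GPU at once.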

throwaway-mezzo-mix commented 1 year ago

It's happening after commit fe17340. I believe with optimized mode the model isn't getting transferred to the GPU at all now, hence the multiple-devices error.

I tried reverting that change locally too, but I'm sharing @oobabooga's experience in that it doesn't even launch then.

AscendedGravity commented 1 year ago

Same error occurs per the OP when using either the optimized flag or optimized-turbo.

GTX 1070 8gb

throwaway-mezzo-mix commented 1 year ago

I've taken another look at that section, and on slightly closer inspection that commit seems to make very little sense. That "if" could never succeed. (I haven't done much programming in a long time, so I hope I'm not being totally stupid.)

hlky commented 1 year ago

OK, so it's working for @DenkingOfficial and @athu16 with fe17340 reverted. @throwaway-mezzo-mix, @AscendedGravity: have you updated since I reverted that commit?

hlky commented 1 year ago

@oobabooga It doesn't work for you without fe17340, i.e. with .cuda() never called, as @throwaway-mezzo-mix pointed out.

so, what card do you have?

could everyone state their graphics card?

throwaway-mezzo-mix commented 1 year ago

could everyone state their graphics card?

GTX 960 4GB

oobabooga commented 1 year ago

GTX 1650 4GB

Linux mint, running with --precision full --no-half

Same error occurs per OP when using either the optimized tag or optimized-turbo.

GTX 1070 8gb

With 8GB you can fit the entire model into VRAM at once. The optimized version exists to allow 4GB GPUs to run SD.

throwaway-mezzo-mix commented 1 year ago

Also, @hlky, I've updated my webui.py now and reverting seems to have worked, at least for me.

athu16 commented 1 year ago

RTX 2060 Mobile 6GB

AscendedGravity commented 1 year ago

@AscendedGravity have you updated since I reverted that commit?

Did a manual revert of https://github.com/hlky/stable-diffusion-webui/commit/fe173407fe08cf496ef1607bbce3100f21bf4b3e and yes, the optimized-turbo flag functions.

With 8GB you can fit the entire model into VRAM at once. Optimized version exists to allow 4GB GPUs to run SD.

While true, I can run larger image sizes and/or larger batch sizes with optimized.

oobabooga commented 1 year ago

If it works for everyone but me this issue can be closed and reopened later if someone else experiences the same issue. Maybe I'm doing something wrong.

throwaway-mezzo-mix commented 1 year ago

Linux mint, running with --precision full --no-half

@oobabooga Just to make sure, were you running with --optimized when testing this change? Because that's the only time it would affect anything. That might explain why your experience is different.

oobabooga commented 1 year ago

were you running with --optimized when testing this change

Yes

throwaway-mezzo-mix commented 1 year ago

Yes

I'm out of ideas, then. Maybe someone else still has an idea?

oobabooga commented 1 year ago

Is it possible that some of you are running code from

https://github.com/hlky/stable-diffusion

and not

https://github.com/hlky/stable-diffusion-webui

?

hlky commented 1 year ago

@oobabooga Possibly, I hadn't synced it to main; I have now. Just have to wait and see what people say.

GeorgeTownsendd commented 1 year ago

Just adding my experience here.

This repo works fine on Windows; no issues with any functionality when running optimised mode on a 1050 Ti 4GB.

On Arch, once again with optimised mode, I get an OOM error on startup. With the "if not opt.optimised: model.cuda()" block in webui.py, the application starts, but I get the "found at least two devices, cuda:0 and cpu!" error from the OP when I try to run any prompt.

Thought the OS factor might be relevant to someone more knowledgeable than myself.
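The guard quoted above can be sketched like this (the surrounding classes are stand-ins I've made up, not the real webui.py code), which also shows where the OP's "two devices" error comes from when the optimized path forgets its per-stage transfers:

```python
# Hedged sketch of the device-placement guard under discussion.
# FakeModel and Opt are hypothetical stand-ins for torch model / CLI opts.

class FakeModel:
    def __init__(self):
        self.device = "cpu"

    def cuda(self):
        self.device = "cuda:0"
        return self

class Opt:
    def __init__(self, optimized):
        self.optimized = optimized

def place_model(model, opt):
    if not opt.optimized:
        model.cuda()  # full-model transfer; OOMs on 4 GB cards
    # In optimized mode, sub-modules are expected to be moved to the GPU
    # later, stage by stage. If that per-stage transfer is missing, inputs
    # on cuda:0 meet weights still on cpu, producing the
    # "Expected all tensors to be on the same device" RuntimeError.
    return model

place_model(FakeModel(), Opt(optimized=True))
```

So the guard itself is fine for normal mode; the bug in the thread is that the optimized path ended up with no transfer at all.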