`--force-enable-xformers` is obsolete, please use `--xformers`
Thanks for the quick reply, @camenduru
The issue is not related to the xformers module (even if it appeared in my quoted output).
Even before the flag `--force-enable-xformers` was rendered obsolete, I had this issue. And now, after updating the notebook with the new flag `--xformers`, I still have it:
```
Python 3.8.16 (default, Dec 7 2022, 01:12:13) [GCC 7.5.0]
Commit hash: 4af3ca5393151d61363c30eef4965e694eeac15e
Installing gfpgan
Installing clip
Installing open_clip
Cloning Stable Diffusion into repositories/stable-diffusion-stability-ai...
Cloning Taming Transformers into repositories/taming-transformers...
Cloning K-diffusion into repositories/k-diffusion...
Cloning CodeFormer into repositories/CodeFormer...
Cloning BLIP into repositories/BLIP...
Installing requirements for CodeFormer
Installing requirements for Web UI
Launching Web UI with arguments: --share --xformers
Loading config from: /content/stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.yaml
LatentDiffusion: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
Downloading: 100% 3.94G/3.94G [00:56<00:00, 70.3MB/s]
^C
```
This issue exclusively happens with this particular notebook (again, I don't have it with, for example, the Analog Diffusion module), and only if I try to mount my Google Drive before running the A1111 installation cell.
this is working https://github.com/camenduru/stable-diffusion-webui-colab/blob/main/stable_diffusion_v2_1_webui_colab.ipynb
please compare your code with stable_diffusion_v2_1_webui_colab.ipynb
wait, it is not working with gdrive. interesting
I have changed nothing in your code. All I do is:
I do exactly these steps with your colab for Analog Diffusion and it works flawlessly, as expected.
I have no clue why there's this difference in behaviour.
| notebook | with gdrive | without gdrive |
| --- | --- | --- |
| stable_diffusion_v2_1_webui_colab | crashed | works |
stable_diffusion_1_5_webui_colab with gdrive
stable_diffusion_v2_1 is using too much system RAM: without gdrive it works, but with gdrive, stable_diffusion_v2_1 does not fit in the system RAM
if we convert the fp32 checkpoint v2-1_768-ema-pruned.ckpt (5.21 GB) to fp16 (5.21/2 GB), it probably fits
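A quick sanity check on that arithmetic: fp16 stores each weight in 2 bytes versus fp32's 4, so converting should roughly halve the checkpoint. A minimal sketch (the 5.21 GB figure is the one quoted above):

```shell
# fp32 stores each weight in 4 bytes, fp16 in 2, so converting the
# 5.21 GB fp32 checkpoint should roughly halve it.
awk 'BEGIN { ckpt_gb = 5.21; fp16_bytes = 2; fp32_bytes = 4;
             printf "%.3f GB\n", ckpt_gb * fp16_bytes / fp32_bytes }'
# prints: 2.605 GB
```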
Had the same problem; figured out a workaround: crash the colab right before launching the UI, which will free up the RAM.
Do this after downloading the models:
```python
import os
os.kill(os.getpid(), 9)  # SIGKILL the runtime process, freeing its RAM
```
This will crash the runtime. Now reconnect and run:
```python
%cd /content/stable-diffusion-webui
!python launch.py --share --xformers
```
thanks @MitPitt, good idea
Thanks, @MitPitt, but I still can't make it work.
I split @camenduru's original notebook into multiple cells as in the screenshot, and executed your recommended `os.kill` cell.
The environment crashes as expected and reconnects automatically.
Then I proceed launching A1111, but I still run out of system RAM.
What am I doing wrong?
Google Drive is taking RAM as well; I had this problem. You will have to download any needed files manually, without mounting the drive. Use this command to download public files from Google Drive:
```shell
!curl -o train_images.zip -L 'https://drive.google.com/uc?export=download&confirm=yes&id=[ID]' # replace [ID]
```
And you can find your file's ID by looking at the share link: https://drive.google.com/file/d/ABCDEFG/view?usp=share_link
Here, the ID is ABCDEFG
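If you'd rather not eyeball the link, the ID can also be extracted programmatically; a small sketch (the regex is my own, and the link is the placeholder example from above, not a real file):

```shell
# Pull the file ID out of a Drive share link of the form
# https://drive.google.com/file/d/<ID>/view?...
share_link='https://drive.google.com/file/d/ABCDEFG/view?usp=share_link'
file_id=$(echo "$share_link" | sed -E 's#.*/file/d/([^/]+)/.*#\1#')
echo "$file_id"  # prints: ABCDEFG
# then: curl -o file.zip -L "https://drive.google.com/uc?export=download&confirm=yes&id=${file_id}"
```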
hi @system1system2, I converted it to fp16; it is now 2.58 GB. Please use this with gdrive:
https://huggingface.co/ckpt/stable-diffusion-2-1/resolve/main/v2-1_768-ema-pruned-fp16.ckpt
Thank you so much for converting this. Unfortunately, I still have issues:
I have modified your colab notebook to download the new file and save it under the old file name, so I don't have to rename the yaml file as well:
```shell
!wget https://huggingface.co/ckpt/stable-diffusion-2-1/resolve/main/v2-1_768-ema-pruned-fp16.ckpt -O /content/stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.ckpt
!wget https://raw.githubusercontent.com/Stability-AI/stablediffusion/main/configs/stable-diffusion/v2-inference-v.yaml -O /content/stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.yaml
```
It correctly downloads the half-precision variant (which is saved in the Colab drive as a 2.4 GB file), but then it insists on loading a 3.94 GB file:
and that's where it runs out of memory as usual:
```
Python 3.8.16 (default, Dec 7 2022, 01:12:13) [GCC 7.5.0]
Commit hash: 4af3ca5393151d61363c30eef4965e694eeac15e
Installing gfpgan
Installing clip
Installing open_clip
Cloning Stable Diffusion into repositories/stable-diffusion-stability-ai...
Cloning Taming Transformers into repositories/taming-transformers...
Cloning K-diffusion into repositories/k-diffusion...
Cloning CodeFormer into repositories/CodeFormer...
Cloning BLIP into repositories/BLIP...
Installing requirements for CodeFormer
Installing requirements for Web UI
Launching Web UI with arguments: --share --xformers
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
Loading config from: /content/stable-diffusion-webui/models/Stable-diffusion/v2-1_768-ema-pruned.yaml
LatentDiffusion: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
Downloading: 100% 3.94G/3.94G [01:04<00:00, 61.2MB/s]
^C
```
Also notice that during the process, Python raises a new error:

```
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
```
I don't know if it's important or not as I cannot load the UI to test image generation.
at this point, I am thinking that there may be a memory leak in the code
Agree with all above. I tried just installing the WebUI without connecting my drive. It died the same death as described above.
```
Launching Web UI with arguments: --share --xformers
A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
Loading config from: /content/stable-diffusion-webui/models/Stable-diffusion/768-v-ema.yaml
LatentDiffusion: Running in v-prediction mode
DiffusionWrapper has 865.91 M params.
Downloading: 100% 3.94G/3.94G [00:56<00:00, 69.6MB/s]
^C
```
And I switched to this latest version because I can't fix the "RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)" on the older version, which I previously was able to work around. (Now, none of the suggested edits to the ddpm file work.) I would so dearly love to train more embeddings, but I can't seem to find a version that runs for me on Colab (with a paid account).
Edit: But I did get the WebUI running from midjourney_v4_diffusion_webui_colab.ipynb before attaching the Google Drive. (Now trying to mount the drive does nothing. No pop-up, no error message, no mount.) And also the runtime error about indices is still a problem. I am really sad about this.
Forking and patching Stability-AI's stablediffusion repository will bring it within 12 GB. Here is a similar issue. (Translated with DeepL.)
https://github.com/ddPn08/automatic1111-colab/issues/16 https://github.com/ddPn08/automatic1111-colab/commit/27484525d2aaf98b8ba75ce955dd553dc2eb3ab3
@thx-pw Thank you!
```shell
!sed -i -e '''/prepare_environment()/a\ os.system\(f\"""sed -i -e ''\"s/dict()))/dict())).cuda()/g\"'' /content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/util.py""")''' /content/stable-diffusion-webui/launch.py
```
@thx-pw It's working here: `return get_obj_from_str(config["target"])(**config.get("params", dict())).cuda()`. Please check.
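The nested quoting makes that patch hard to read. Shown in isolation, the inner sed it injects performs this substitution on the affected line of `ldm/util.py` (a sketch of the effect only, not the full patch):

```shell
# Appending .cuda() instantiates the model directly on the GPU,
# so the weights never occupy system RAM as an extra fp32 copy.
echo 'return get_obj_from_str(config["target"])(**config.get("params", dict()))' \
  | sed -e 's/dict()))/dict())).cuda()/g'
# prints: return get_obj_from_str(config["target"])(**config.get("params", dict())).cuda()
```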
@system1system2 please try this https://github.com/camenduru/stable-diffusion-webui-colab/blob/main/stable_diffusion_v2_1_webui_colab.ipynb
You can do it in one line. That's smarter.
> @system1system2 please try this https://github.com/camenduru/stable-diffusion-webui-colab/blob/main/stable_diffusion_v2_1_webui_colab.ipynb
I am running. So far, sed is throwing "no such file or directory" errors (for both sed calls). Edit: but apparently it doesn't matter? I couldn't mount my Google drive, but I just uploaded my training images and am now training an embedding.
hi @MisoSpree, sed is working; we are getting this message because we use sed inside sed before the file is fetched from the repo. a little trick hehe
```
sed: can't read /content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/util.py: No such file or directory
sed: can't read /content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py: No such file or directory
```
Roger that. Ignoring error messages is right up my alley.
Note that when training an embedding, the loss is reported as a NaN:
```
[Epoch 499: 10/10]loss: nan: 10% 4999/50000 [45:34<6:49:13, 1.83it/s]
```
And the image put out every N steps is just black. Looks like something is still broken.
oh no
@camenduru believe it or not, it works (at least for ordinary txt2img generations; I didn't try to train an embedding like @MisoSpree). The weird sed trick worked, but you might want to mention it in the documentation, or you'll have an avalanche of people reporting the same `No such file or directory` error that @MisoSpree reported.
Thanks for the patience in fixing this. I'm training without any issues this morning thanks to you.
In my environment, I had no problem training an embedding.
hi @ddPn08 can you train without black example output? please show us how
I tried, and I also got a black output.
I created embedding from the train tab of AUTOMATIC1111 and trained without changing any settings. I tested it on my notebook, so I'll try it on this one too.
> I tried, and I also got a black output
I am glad I am not the only one. Were you seeing loss reported as NaN?
Edit: Just to check, I did this again today. This is in Colab. Today I first connected my Google Drive. (This is different from the last time when I didn't connect the google drive at all.) Then I ran https://github.com/camenduru/stable-diffusion-webui-colab/blob/main/stable_diffusion_v2_1_webui_colab.ipynb. Everything installed. I generated a single text-to-image (which I always do as a test when I get the WebUI open.) That worked fine. Then I created an embedding and ran the training. Still, loss is being reported as NaN and the first output image was all black. Then I stopped the training.
Even after removing the low-RAM patch, I am still getting the NaN error during training. So I tried everything, and when I changed from SD2.1 to WD1.4e1, the NaN error was gone. https://huggingface.co/hakurei/waifu-diffusion-v1-4/tree/main I don't know why.
Just a quick note to let you know, @camenduru, that this new version of the notebook runs out of memory again :)
The problem is the single `sed` command in place of the previous two. If you replace it with the two lines below, the notebook works just fine, including the triton installation and the new CivitAI extension:
```shell
!sed -i -e '''/prepare_environment()/a\ os.system(f\"""sed -i -e ''\"s/self.logvar\[t\]/self.logvar\[t.item()\]/g\"'' /content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py""")''' /content/stable-diffusion-webui/launch.py
!sed -i -e '''/prepare_environment()/a\ os.system(f\"""sed -i -e ''\"s/dict()))/dict())).cuda()/g\"'' /content/stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/util.py""")''' /content/stable-diffusion-webui/launch.py
```
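For reference, the first of those commands injects a sed call that rewrites the logvar indexing in `ddpm.py`. Its effect on the affected expression, shown in isolation (a sketch, assuming the line reads as in the Stability-AI repo):

```shell
# Indexing self.logvar with a plain Python int (t.item()) instead of a
# tensor avoids the "indices should be either on cpu or on the same
# device as the indexed tensor" RuntimeError mentioned earlier.
echo 'logvar_t = self.logvar[t].to(self.device)' \
  | sed -e 's/self.logvar\[t\]/self.logvar\[t.item()\]/g'
# prints: logvar_t = self.logvar[t.item()].to(self.device)
```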
I tested this one and it worked with gdrive. I didn't change anything; maybe you are getting less RAM. I got 12.68 GB. https://github.com/camenduru/stable-diffusion-webui-colab/blob/main/stable_diffusion_v2_1_webui_colab.ipynb
Same amount. Not sure why it works with the two sed lines but fails with the single one. At this point, it's up to you: we can close this issue as is (it works for me, at least with this specific configuration) or leave it open.
Hi. All your colab notebooks are amazing. Thanks for sharing them with the community.
I have a problem with one of them:
stable_diffusion_v2_1_webui_colab
If I create a new cell to mount my Google Drive and run it before your cell that initializes SD2.1, the initialization interrupts halfway and I get this output:
I read that the `^C` interrupt might indicate the system has run out of memory. If I do not run the cell that mounts Google Drive, everything works fine.
Also, and this is where it gets strange: if I run another of your Colab notebooks, like analog_diffusion_webui_colab, and run the cell that mounts Google Drive first, everything works fine, too.