Closed — jwooldridge234 closed this issue 1 year ago
Not sure if it's related, but I had to manually install "scikit-learn" rather than "sklearn" as the latter isn't accessible on MPS (deprecated).
You are doing the right thing with scikit-learn! I think the error is related to this issue (https://github.com/songweige/rich-text-to-image/issues/11). Could you check the version of the diffusers package you are using?
I think I might know the error. Are you using pytorch with version >= 2.0?
Yeah. Do I need to down/upgrade?
I'm running Diffusers 0.18.2.
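(Editor's note: a quick, stdlib-only way to check which versions are installed, without importing the heavy packages themselves — the package names are just the ones discussed in this thread:)

```python
# Report installed versions of the packages discussed above.
# Uses importlib.metadata (stdlib, Python >= 3.8), so it works even in
# environments where importing torch/diffusers would be slow or fail.
from importlib.metadata import version, PackageNotFoundError

for pkg in ("torch", "diffusers", "scikit-learn"):
    try:
        print(f"{pkg}: {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")
```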
Let me see if I can reproduce and fix it on my end first!
Sounds great, thanks!
Hey @jwooldridge234, I was able to reproduce the error and make a fix for pytorch>=2.0. It should work now. Could you update the extension and see if the error is gone? Thanks!
@songweige It is! Thanks for getting that working.
Unfortunately hit another error:
text_input {"ops":[{"insert":"A stunning photo of a "},{"attributes":{"link":"http://localhost:7860/file=./extensions/sd-webui-rich-text/cat%20wearing%20a%20red%20hat%20and%20a%20green%20sweater"},"insert":"cat"},{"insert":" and a "},{"attributes":{"link":"http://localhost:7860/file=./extensions/sd-webui-rich-text/blue%20dog%20with%20yellow%20eyes"},"insert":"dog"},{"insert":".\n"}]} 512 512 41 8.5 9 0.45 0 0.3 0.5
/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/extensions/sd-webui-rich-text/scripts/models/region_diffusion.py:202: FutureWarning: Accessing config attribute `in_channels` directly via 'UNet2DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'UNet2DConditionModel's config object instead, e.g. 'unet.config.in_channels'.
(text_embeddings.shape[0] // 2, self.unet.in_channels, height // 8, width // 8), device=self.device)
time lapses to get attention maps: 341.7217
Traceback (most recent call last):
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/routes.py", line 488, in run_predict
output = await app.get_blocks().process_api(
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1431, in process_api
result = await self.call_function(
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1103, in call_function
prediction = await anyio.to_thread.run_sync(
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/venv/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/venv/lib/python3.10/site-packages/gradio/utils.py", line 707, in wrapper
response = f(*args, **kwargs)
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/extensions/sd-webui-rich-text/scripts/rich_text_on_tab.py", line 156, in generate
color_obj_masks, segments_vis, token_maps = get_token_maps(model.selfattn_maps, model.crossattn_maps, model.n_maps, run_dir,
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/extensions/sd-webui-rich-text/scripts/models/utils/attention_utils.py", line 722, in get_token_maps
resized_token_maps = [token_map.unsqueeze(1).repeat(
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/extensions/sd-webui-rich-text/scripts/models/utils/attention_utils.py", line 723, in <listcomp>
[1, 4, 1, 1]).to(attn_map.dtype).cuda() for token_map in resized_token_maps]
File "/Users/jackwooldridge/StableDiffusion/stable-diffusion-webui/venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 239, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
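(Editor's note: the FutureWarning near the top of that log is a separate, one-line fix — diffusers now wants `in_channels` read from the model's `config` object rather than from the model directly. Illustrated here with a stand-in object, since the real attribute lives on diffusers' `UNet2DConditionModel`:)

```python
from types import SimpleNamespace

# Stand-in for a diffusers UNet2DConditionModel: newer diffusers versions
# expose architecture attributes on a .config namespace.
unet = SimpleNamespace(config=SimpleNamespace(in_channels=4))

# Deprecated access (triggers the FutureWarning on real diffusers models):
#   channels = unet.in_channels
# Preferred access, per the warning text:
channels = unet.config.in_channels
print(channels)  # 4 for SD v1.5 UNets
```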
What are the hardware requirements for this method, by the way? It took 6 minutes to get the attention maps. I have 8 GB of VRAM.
@songweige Got past that by using "mps" wherever the code specified "cuda" and changing the .cuda() calls to .to('mps'). The main problem now is how slow it is. It gets past the attention-map stage after ~6-7 minutes and then moves to (I assume) the token maps. I finally killed it after it ran for over 20 minutes with no change. Maybe my GPU just isn't powerful enough to handle it?
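(Editor's note: the workaround described above amounts to a device-selection fallback — prefer CUDA, then Apple's MPS backend, then CPU. A minimal sketch, written as a pure function over availability flags so it runs anywhere; in the extension you would call it as `pick_device(torch.cuda.is_available(), torch.backends.mps.is_available())` and replace each hard-coded `.cuda()` with `.to(device)`. This is not the extension's actual code:)

```python
# Choose the best available backend in priority order.
# cuda_available / mps_available correspond to torch.cuda.is_available()
# and torch.backends.mps.is_available() on a real PyTorch install.
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    if cuda_available:
        return "cuda"      # NVIDIA GPU
    if mps_available:
        return "mps"       # Apple Silicon / Metal
    return "cpu"           # portable fallback
```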
Glad it works!!! And thanks for debugging the code. I hadn't considered the scenario of running it on a Mac before.
Regarding the time, it sounds like you have finished the plain-text-to-image part and moved on to the rich-text-to-image part. The time for that part depends on which and how many rich-text formats you use in the prompt. Usually, font, footnote, and font size are the fastest. Color is a lot slower, as it requires some gradient computation. Maybe try reducing the steps? Also, I'm curious how long it takes to run normal Stable Diffusion sampling in your setup.
@songweige I will try running without any rich text and reducing the steps, and see if I can get the gen time down. Thanks!
Going to close this as I believe you resolved all the issues I was having. :)
Hey, just tried running and I got this error when running runwayml/stable-diffusion-v1-5. Happened after restarting as well. I'm running the latest Auto1111 version on MacOS. Let me know if there's anything else I can do to help debug.