Closed: almfahd41 closed this issue 1 year ago
I found a fix; I just added this to the Colab notebook:
%cd /content
%env TF_CPP_MIN_LOG_LEVEL=1
!apt -y update -qq
# download a prebuilt tcmalloc and preload it so it replaces the default allocator
!wget https://github.com/camenduru/gperftools/releases/download/v1.0/libtcmalloc_minimal.so.4 -O /content/libtcmalloc_minimal.so.4
%env LD_PRELOAD=/content/libtcmalloc_minimal.so.4
Thanks for the fix, I will test it too. I have been having some other issues with running the app on Colab for the last 2 or 3 weeks, so I'm not sure this was the only problem. Also, I haven't tested all combinations of optimizations, so if something does not work, it's better to turn off some of the optimizations and check whether the error still persists.
optimizations: xformers for prior; sequential CPU offloading for prior; xformers for decoder; sequential CPU offloading for decoder; attention slicing for decoder: slice_size=max
seed generated: 54955802784
0% 0/25 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/gradio/routes.py", line 439, in run_predict
    output = await app.get_blocks().process_api(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1384, in process_api
    result = await self.call_function(
  File "/usr/local/lib/python3.10/dist-packages/gradio/blocks.py", line 1089, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/dist-packages/gradio/utils.py", line 700, in wrapper
    response = f(*args, **kwargs)
  File "/content/kubin/src/ui_blocks/t2i.py", line 240, in generate
    return generate_fn(params)
  File "/content/kubin/src/webui.py", line 34, in <lambda>
    generate_fn=lambda params: kubin.model.t2i(params),
  File "/content/kubin/src/models/model_diffusers22/model_22.py", line 116, in t2i
    return self.t2i_cnet(params)
  File "/content/kubin/src/models/model_diffusers22/model_22.py", line 457, in t2i_cnet
    image_embeds, zero_embeds = prior(
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/pipelines/kandinsky2_2/pipeline_kandinsky2_2_prior.py", line 494, in __call__
    predicted_image_embedding = self.prior(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/models/prior_transformer.py", line 346, in forward
    hidden_states = block(hidden_states, attention_mask=attention_mask)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/models/attention.py", line 154, in forward
    attn_output = self.attn1(
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/diffusers/models/attention_processor.py", line 321, in forward
    return self.processor(
  File "/usr/local/lib/python3.10/dist-packages/diffusers/models/attention_processor.py", line 1046, in __call__
    hidden_states = xformers.ops.memory_efficient_attention(
  File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/__init__.py", line 192, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/__init__.py", line 290, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/__init__.py", line 308, in _memory_efficient_attention_forward
    _ensure_op_supports_or_raise(ValueError, "memory_efficient_attention", op, inp)
  File "/usr/local/lib/python3.10/dist-packages/xformers/ops/fmha/dispatch.py", line 45, in _ensure_op_supports_or_raise
    raise exc_type(
ValueError: Operator `memory_efficient_attention` does not support inputs:
    query : shape=(64, 81, 1, 64) (torch.float16)
    key : shape=(64, 81, 1, 64) (torch.float16)
    value : shape=(64, 81, 1, 64) (torch.float16)
    attn_bias : <class 'torch.Tensor'>
    p : 0.0
`flshattF` is not supported because:
    attn_bias type is <class 'torch.Tensor'>
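For anyone hitting the same traceback: the last lines say the xformers flash-attention kernel (`flshattF`) rejects a dense `attn_bias` tensor, which is what the prior passes as its attention mask. Following the suggestion above, a quick way to confirm that xformers (rather than CPU offloading or attention slicing) is the culprit is to toggle it off for the prior and rerun. The snippet below is only a sketch against the plain diffusers pipeline rather than kubin's wrapper; the model id, prompt and step count are placeholders.

import torch
from diffusers import KandinskyV22PriorPipeline

# Sketch: load the Kandinsky 2.2 prior with the same optimizations listed above.
prior = KandinskyV22PriorPipeline.from_pretrained(
    "kandinsky-community/kandinsky-2-2-prior", torch_dtype=torch.float16
)
prior.enable_sequential_cpu_offload()               # keep CPU offloading on, as in the config above
prior.enable_xformers_memory_efficient_attention()  # the optimization under suspicion

try:
    out = prior("red cat, 4k photo", num_inference_steps=25)
except ValueError as err:
    print("xformers path failed:", err)
    # Fall back to the default attention processor and retry without xformers.
    prior.disable_xformers_memory_efficient_attention()
    out = prior("red cat, 4k photo", num_inference_steps=25)

image_embeds = out.image_embeds
negative_image_embeds = out.negative_image_embeds

If the second call succeeds, the error is specific to the xformers path; upgrading xformers or simply leaving it disabled for the prior is then a likely workaround.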