Firetheft opened this issue 3 months ago
To use xFormers with CUDA 12.1 and PyTorch 2.0.1, you will need to build xFormers from source. Currently, there is no officially precompiled version of xFormers that supports CUDA 12.1.

To do this, follow these steps:

1. Clone the xFormers repository:

   ```
   git clone https://github.com/facebookresearch/xformers.git
   ```

2. Install the necessary dependencies. Make sure `ninja` is installed, which is required for building; you can install it using pip:

   ```
   pip install ninja
   ```

   `CMake` and a compatible C++ compiler should also be installed.

3. Build xFormers from source:

   ```
   cd xformers
   pip install -v -U git+https://github.com/facebookresearch/xformers.git@main#egg=xformers
   ```

   You may need to adjust `LD_LIBRARY_PATH` or other environment variables if you encounter issues.

4. Verify the installation:

   ```
   python -m xformers.info
   ```
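If `python -m xformers.info` reports the expected build, a short functional check can confirm that the compiled kernels actually run on your GPU. The snippet below is a minimal sketch, assuming a CUDA-capable device is visible to PyTorch; it is not part of the xFormers documentation:

```python
# Minimal smoke test for a source-built xFormers install (illustrative sketch;
# assumes a CUDA-capable GPU is visible to PyTorch).
import torch
import xformers.ops as xops

assert torch.cuda.is_available(), "this check assumes a CUDA GPU"

# Shapes follow xFormers' (batch, seq_len, num_heads, head_dim) convention.
q = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)
v = torch.randn(1, 128, 8, 64, device="cuda", dtype=torch.float16)

out = xops.memory_efficient_attention(q, k, v)
print("memory_efficient_attention OK, output shape:", tuple(out.shape))
```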
Troubleshooting:

- You can set `MAX_JOBS=2` or a similar variable to limit the number of parallel jobs during the build process, which can help avoid memory issues.
- If you face any difficulties or have specific error messages, consider checking for solutions in the GitHub Issues section of the xFormers repository or on the PyTorch Forums.
```
pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 xformers --index-url https://download.pytorch.org/whl/cu121
```

This pattern installs the correct version of xformers for whichever torch and CUDA versions you pin; in this case it resolves to xformers 0.0.27.
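As a quick sanity check that the pinned versions actually ended up in the environment, something like the following can be run (an illustrative sketch, not an official check):

```python
# Illustrative version check after installing from the cu121 index.
import torch
import xformers

print("torch:", torch.__version__)        # expected: 2.3.1+cu121
print("torch CUDA:", torch.version.cuda)  # expected: 12.1
print("xformers:", xformers.__version__)  # expected: 0.0.27
print("CUDA available:", torch.cuda.is_available())
```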
```
Python 3.10.6 (tags/v3.10.6:9c7b4bd, Aug 1 2022, 21:53:49) [MSC v.1932 64 bit (AMD64)]
Version: f2.0.1v1.10.1-previous-240-g294416ed
Commit hash: 294416ed55cad69eb8a01393854457e35207a2d4
Launching Web UI with arguments: --theme dark --api --ckpt-dir D:/AI/WebUI/models/Stable-diffusion --vae-dir D:/AI/WebUI/models/VAE --embeddings-dir D:/AI/WebUI/embeddings --lora-dir D:/AI/WebUI/models/Lora --gfpgan-models-path D:/AI/WebUI/models/GFPGAN --esrgan-models-path D:/AI/WebUI/models/ESRGAN --controlnet-dir D:/AI/WebUI/models/ControlNe
Total VRAM 8188 MB, total RAM 32506 MB
pytorch version: 2.3.1+cu121
WARNING:xformers:A matching Triton is not available, some optimizations will not be enabled
Traceback (most recent call last):
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\xformers\__init__.py", line 57, in _is_triton_available
    import triton  # noqa
ModuleNotFoundError: No module named 'triton'
xformers version: 0.0.27
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4060 Laptop GPU : native
Hint: your device supports --cuda-malloc for potential speed improvements.
VAE dtype preferences: [torch.bfloat16, torch.float32] -> torch.bfloat16
CUDA Using Stream: False
D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\transformers\utils\hub.py:127: FutureWarning: Using TRANSFORMERS_CACHE is deprecated and will be removed in v5 of Transformers. Use HF_HOME instead.
  warnings.warn(
Using xformers cross attention
Using xformers attention for VAE
ControlNet preprocessor location: D:\AI\webui_forge_cu121_torch231\webui\models\ControlNetPreprocessor
D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\transformers\tokenization_utils_base.py:1601: FutureWarning: clean_up_tokenization_spaces was not set. It will be set to True by default. This behavior will be depracted in transformers v4.45, and will be then set to False by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
  warnings.warn(
Prompt Expansion: Vocab with 640 words.
sd-webui-prompt-all-in-one background API service started successfully.
2024-08-12 09:48:59,142 - ControlNet - INFO - ControlNet UI callback registered.
Model selected: {'checkpoint_info': {'filename': 'D:\AI\WebUI\models\Stable-diffusion\flux1-dev-bnb-nf4.safetensors', 'hash': '0184473b'}, 'vae_filename': None, 'unet_storage_dtype': 'nf4'}
Running on local URL: http://127.0.0.1:7860
To create a public link, set share=True in launch().
IIB Database file has been successfully backed up to the backup folder.
Startup time: 16.4s (prepare environment: 2.6s, launcher: 1.7s, import torch: 3.1s, initialize shared: 0.1s, other imports: 0.8s, load scripts: 2.2s, create ui: 3.1s, gradio launch: 1.2s, add APIs: 1.4s).
Environment vars changed: {'stream': False, 'inference_memory': 1024.0, 'pin_shared_memory': False}
Model selected: {'checkpoint_info': {'filename': 'D:\AI\WebUI\models\Stable-diffusion\flux1-dev-bnb-nf4.safetensors', 'hash': '0184473b'}, 'vae_filename': None, 'unet_storage_dtype': None}
Model selected: {'checkpoint_info': {'filename': 'D:\AI\WebUI\models\Stable-diffusion\flux1-dev-bnb-nf4.safetensors', 'hash': '0184473b'}, 'vae_filename': None, 'unet_storage_dtype': 'nf4'}
Loading Model: {'checkpoint_info': {'filename': 'D:\AI\WebUI\models\Stable-diffusion\flux1-dev-bnb-nf4.safetensors', 'hash': '0184473b'}, 'vae_filename': None, 'unet_storage_dtype': 'nf4'}
ERROR: Exception in ASGI application
Traceback (most recent call last):
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\uvicorn\protocols\http\h11_impl.py", line 404, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\uvicorn\middleware\proxy_headers.py", line 84, in __call__
    return await self.app(scope, receive, send)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\fastapi\applications.py", line 1106, in __call__
    await super().__call__(scope, receive, send)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\starlette\applications.py", line 122, in __call__
    await self.middleware_stack(scope, receive, send)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\starlette\middleware\errors.py", line 184, in __call__
    raise exc
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\starlette\middleware\errors.py", line 162, in __call__
    await self.app(scope, receive, _send)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\gradio\route_utils.py", line 730, in __call__
    await self.simple_response(scope, receive, send, request_headers=headers)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\gradio\route_utils.py", line 746, in simple_response
    await self.app(scope, receive, send)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\starlette\middleware\exceptions.py", line 79, in __call__
    raise exc
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\starlette\middleware\exceptions.py", line 68, in __call__
    await self.app(scope, receive, sender)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\fastapi\middleware\asyncexitstack.py", line 20, in __call__
    raise e
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\fastapi\middleware\asyncexitstack.py", line 17, in __call__
    await self.app(scope, receive, send)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\starlette\routing.py", line 718, in __call__
    await route.handle(scope, receive, send)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\starlette\routing.py", line 276, in handle
    await self.app(scope, receive, send)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\starlette\routing.py", line 66, in app
    response = await func(request)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\fastapi\routing.py", line 274, in app
    raw_response = await run_endpoint_function(
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\fastapi\routing.py", line 191, in run_endpoint_function
    return await dependant.call(**values)
  File "D:\AI\webui_forge_cu121_torch231\webui\extensions\sd-webui-prompt-all-in-one\scripts\on_app_started.py", line 108, in _token_counter
    return get_token_counter(data['text'], data['steps'])
  File "D:\AI\webui_forge_cu121_torch231\webui\extensions\sd-webui-prompt-all-in-one\scripts\physton_prompt\get_token_counter.py", line 30, in get_token_counter
    cond_stage_model = sd_models.model_data.sd_model.cond_stage_model
AttributeError: 'NoneType' object has no attribute 'cond_stage_model'
StateDict Keys: {'transformer': 2350, 'vae': 244, 'text_encoder': 198, 'text_encoder_2': 220, 'ignore': 0}
Using Detected T5 Data Type: torch.float8_e4m3fn
Working with z of shape (1, 16, 32, 32) = 16384 dimensions.
K-Model Created: {'storage_dtype': 'nf4', 'computation_dtype': torch.bfloat16}
Model loaded in 2.3s (unload existing model: 0.4s, load state dict: 0.4s, forge model load: 1.6s).
Skipping unconditional conditioning when CFG = 1. Negative Prompts are ignored.
To load target model ModuleDict
Begin to load 1 model
[Memory Management] Current Free GPU Memory: 7085.16 MB
[Memory Management] Required Model Memory: 5154.62 MB
[Memory Management] Required Inference Memory: 1024.00 MB
[Memory Management] Estimated Remaining GPU Memory: 906.55 MB
Moving model(s) has taken 4.64 seconds
Traceback (most recent call last):
  File "D:\AI\webui_forge_cu121_torch231\webui\modules_forge\main_thread.py", line 37, in loop
    task.work()
  File "D:\AI\webui_forge_cu121_torch231\webui\modules_forge\main_thread.py", line 26, in work
    self.result = self.func(*self.args, **self.kwargs)
  File "D:\AI\webui_forge_cu121_torch231\webui\modules\txt2img.py", line 110, in txt2img_function
    processed = processing.process_images(p)
  File "D:\AI\webui_forge_cu121_torch231\webui\modules\processing.py", line 799, in process_images
    res = process_images_inner(p)
  File "D:\AI\webui_forge_cu121_torch231\webui\modules\processing.py", line 912, in process_images_inner
    p.setup_conds()
  File "D:\AI\webui_forge_cu121_torch231\webui\modules\processing.py", line 1497, in setup_conds
    super().setup_conds()
  File "D:\AI\webui_forge_cu121_torch231\webui\modules\processing.py", line 494, in setup_conds
    self.c = self.get_conds_with_caching(prompt_parser.get_multicond_learned_conditioning, prompts, total_steps, [self.cached_c], self.extra_network_data)
  File "D:\AI\webui_forge_cu121_torch231\webui\modules\processing.py", line 463, in get_conds_with_caching
    cache[1] = function(shared.sd_model, required_prompts, steps, hires_steps, shared.opts.use_old_scheduling)
  File "D:\AI\webui_forge_cu121_torch231\webui\modules\prompt_parser.py", line 262, in get_multicond_learned_conditioning
    learned_conditioning = get_learned_conditioning(model, prompt_flat_list, steps, hires_steps, use_old_scheduling)
  File "D:\AI\webui_forge_cu121_torch231\webui\modules\prompt_parser.py", line 189, in get_learned_conditioning
    conds = model.get_learned_conditioning(texts)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\diffusion_engine\flux.py", line 79, in get_learned_conditioning
    cond_t5 = self.text_processing_engine_t5(prompt)
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\text_processing\t5_engine.py", line 123, in __call__
    z = self.process_tokens([tokens], [multipliers])[0]
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\text_processing\t5_engine.py", line 134, in process_tokens
    z = self.encode_with_transformers(tokens)
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\text_processing\t5_engine.py", line 60, in encode_with_transformers
    z = self.text_encoder(
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\nn\t5.py", line 205, in forward
    return self.encoder(x, *args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\nn\t5.py", line 186, in forward
    x, past_bias = l(x, mask, past_bias)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\nn\t5.py", line 162, in forward
    x, past_bias = self.layer[0](x, mask, past_bias)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\nn\t5.py", line 149, in forward
    output, past_bias = self.SelfAttention(self.layer_norm(x), mask=mask, past_bias=past_bias)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\nn\t5.py", line 138, in forward
    out = attention_function(q, k * ((k.shape[-1] / self.num_heads) ** 0.5), v, self.num_heads, mask)
  File "D:\AI\webui_forge_cu121_torch231\webui\backend\attention.py", line 314, in attention_xformers
    mask_out[:, :, :mask.shape[-1]] = mask
RuntimeError: The expanded size of the tensor (1) must match the existing size (64) at non-singleton dimension 0. Target sizes: [1, 256, 256]. Tensor sizes: [64, 256, 256]
The expanded size of the tensor (1) must match the existing size (64) at non-singleton dimension 0. Target sizes: [1, 256, 256]. Tensor sizes: [64, 256, 256]
Error completing request
Arguments: ('task(3r2lqymrx9uqldz)', <gradio.route_utils.Request object at 0x00000154827CFA00>, 'a girl', '', [], 1, 1, 1, 3.5, 1152, 896, False, 0.7, 2, 'Latent', 0, 0, 0, 'Use same checkpoint', 'Use same sampler', 'Use same scheduler', '', '', None, 0, 20, 'Euler', 'Simple', False, '', 0.8, -1, False, -1, 0, 0, 0, False, ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=None, batch_mask_gallery=None, generated_image=None, mask_image=None, mask_image_fg=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, image_fg=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=None, batch_mask_gallery=None, generated_image=None, mask_image=None, mask_image_fg=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, image_fg=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), ControlNetUnit(input_mode=<InputMode.SIMPLE: 'simple'>, use_preview_as_input=False, batch_image_dir='', batch_mask_dir='', batch_input_gallery=None, batch_mask_gallery=None, generated_image=None, mask_image=None, mask_image_fg=None, hr_option='Both', enabled=False, module='None', model='None', weight=1, image=None, image_fg=None, resize_mode='Crop and Resize', processor_res=-1, threshold_a=-1, threshold_b=-1, guidance_start=0.0, guidance_end=1.0, pixel_perfect=False, control_mode='Balanced', save_detected_map=True), False, 7, 1, 'Constant', 0, 'Constant', 0, 1, 'enable', 'MEAN', 'AD', 1, False, 1.01, 1.02, 0.99, 0.95, False, 0.5, 2, False, 3, False, 3, 2, 0, 0.35, True, 'bicubic', 'bicubic', False, 0, 'anisotropic', 0, 'reinhard', 100, 0, 'subtract', 0, 0, 'gaussian', 'add', 0, 100, 127, 0, 'hard_clamp', 5, 0, 'None', 'None', False, 'MultiDiffusion', 768, 768, 64, 4, False, False, False, False, False, 'positive', 'comma', 0, False, False, 'start', '', 1, '', '', 0, '', '', 0, '', '', True, False, False, False, False, False, False, 0, False) {}
Traceback (most recent call last):
  File "D:\AI\webui_forge_cu121_torch231\webui\modules\call_queue.py", line 74, in f
    res = list(func(*args, **kwargs))
TypeError: 'NoneType' object is not iterable
```
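The failing assignment in backend/attention.py (line 314 above) writes a mask with 64 rows into a buffer whose leading dimension is 1, and that alone reproduces the RuntimeError. Below is a standalone sketch of the same shape mismatch, with assumed batch/head counts; it is not Forge's actual code:

```python
# Standalone reproduction of the shape mismatch behind the RuntimeError above.
# Not Forge's code; batch/head counts are assumptions chosen so that
# batch * heads = 64, matching the tensor sizes printed in the log.
import torch

batch, heads, seq_len = 1, 64, 256

# Mask with batch and heads merged into dim 0: shape [64, 256, 256]
mask = torch.zeros(batch * heads, seq_len, seq_len)

# Output buffer allocated with a singleton leading dimension: shape [1, 256, 256]
mask_out = torch.empty(1, seq_len, seq_len)

# Broadcasting cannot shrink dim 0 from 64 to 1, so this assignment raises:
# RuntimeError: The expanded size of the tensor (1) must match the existing
# size (64) at non-singleton dimension 0.
mask_out[:, :, :mask.shape[-1]] = mask
```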