Pokilokui closed this issue 8 months ago
Check whether 'all' was entered as 'a' or 'ALL' during installation. Neither is accepted; it must be typed in lowercase.
Came here to log this as well. It occurs with the docker image, which used to work fine but now doesn't. It looks like xformers was installed with a version that wasn't built with CUDA support:
WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.1.0+cu121)
Python 3.10.11 (you have 3.10.13)
...
NotImplementedError: No operator found for `memory_efficient_attention_forward` with inputs:
query : shape=(1, 2, 1, 40) (torch.float32)
key : shape=(1, 2, 1, 40) (torch.float32)
value : shape=(1, 2, 1, 40) (torch.float32)
attn_bias : <class 'NoneType'>
p : 0.0
`flshattF` is not supported because:
xFormers wasn't build with CUDA support
dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
Operator wasn't built - see `python -m xformers.info` for more info
`tritonflashattF` is not supported because:
xFormers wasn't build with CUDA support
dtype=torch.float32 (supported: {torch.float16, torch.bfloat16})
requires A100 GPU
Only work on pre-MLIR triton for now
`cutlassF` is not supported because:
xFormers wasn't build with CUDA support
Operator wasn't built - see `python -m xformers.info` for more info
`smallkF` is not supported because:
xFormers wasn't build with CUDA support
max(query.shape[-1] != value.shape[-1]) > 32
Operator wasn't built - see `python -m xformers.info` for more info
unsupported embed per head: 40
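For anyone comparing their own log against the one above, here is a minimal sketch that pulls the "built for X (you have Y)" pairs out of the xFormers warning text. The function name `find_build_mismatches` and the regular expression are my own assumptions based on the two lines shown in this log; they are not part of xformers.

```python
import re

# Matches lines such as:
#   "PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.1.0+cu121)"
#   "Python 3.10.11 (you have 3.10.13)"
# The pattern is an assumption derived from the log in this thread.
_MISMATCH = re.compile(r"^\s*(PyTorch|Python)\s+(\S+).*\(you have\s+([^)]+)\)")


def find_build_mismatches(log_text):
    """Return (component, built_for, installed) tuples where the versions differ."""
    mismatches = []
    for line in log_text.splitlines():
        m = _MISMATCH.match(line)
        if m is None:
            continue
        component, built_for, installed = m.group(1), m.group(2), m.group(3).strip()
        if built_for != installed:
            mismatches.append((component, built_for, installed))
    return mismatches


warning = """WARNING[XFORMERS]: xFormers can't load C++/CUDA extensions. xFormers was built for:
PyTorch 2.0.1+cu118 with CUDA 1108 (you have 2.1.0+cu121)
Python 3.10.11 (you have 3.10.13)"""

for component, built_for, installed in find_build_mismatches(warning):
    print(f"{component}: xFormers built for {built_for}, installed {installed}")
```

If the PyTorch pair differs, as it does here, the installed xformers wheel most likely does not match the installed torch build, which would be consistent with the "wasn't build with CUDA support" lines in the log.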
It also looks like TensorFlow was built for CPU only, without any AVX acceleration:
This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
I ran into the same problem; it appeared suddenly. How can it be solved?
[Dataset 0]
loading image sizes.
100%|█████████████████████████████████████████████████████████████████████████████| 2626/2626 [00:13<00:00, 188.47it/s]
make buckets
min_bucket_reso and max_bucket_reso are ignored if bucket_no_upscale is set, because bucket reso is defined by image size automatically / bucket_no_upscaleが指定された場合は、bucketの解像度は画像サイズから自動計算されるため、min_bucket_resoとmax_bucket_resoは無視されます
number of images (including repeats) / 各bucketの画像枚数(繰り返し回数を含む)
bucket 0: resolution (320, 640), count: 100
bucket 1: resolution (384, 512), count: 400
bucket 2: resolution (384, 576), count: 1000
bucket 3: resolution (384, 640), count: 200
bucket 4: resolution (448, 576), count: 100
bucket 5: resolution (512, 384), count: 200
bucket 6: resolution (512, 448), count: 100
bucket 7: resolution (512, 512), count: 2600
bucket 8: resolution (576, 384), count: 400
bucket 9: resolution (576, 448), count: 100
mean ar error (without repeats): 0.00021692813305737463
preparing accelerator
loading model for process 0/1
load StableDiffusion checkpoint: D:/Lora Training/realisticVisionV51_v20Novae.safetensors
UNet2DConditionModel: 64, 8, 768, False, False
loading u-net:
I have the same issue. ValueError: torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is only available for GPU
Can anyone help us with this issue?
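Since the ValueError depends on `torch.cuda.is_available()`, it may help to check that value directly inside the same venv that kohya_ss uses. This is only a diagnostic sketch; the helper name `cuda_report` is hypothetical, and it assumes nothing beyond the standard `torch.__version__`, `torch.version.cuda` and `torch.cuda.is_available()` attributes.

```python
def cuda_report():
    """Collect the facts the ValueError depends on, without assuming torch imports cleanly."""
    report = {
        "torch_installed": False,
        "torch_version": None,
        "torch_cuda_build": None,
        "cuda_available": False,
    }
    try:
        import torch
    except Exception:
        # torch is missing or its native libraries failed to load
        return report
    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # torch.version.cuda is None when the installed build has no CUDA support
    report["torch_cuda_build"] = torch.version.cuda
    report["cuda_available"] = bool(torch.cuda.is_available())
    return report


for key, value in cuda_report().items():
    print(f"{key}: {value}")
```

If `torch_cuda_build` prints `None`, the installed torch is a CPU-only build and no accelerate setting will make `torch.cuda.is_available()` return True. If it prints a CUDA version but `cuda_available` is still False, the driver or the GPU selection is the more likely culprit.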
Got the solution from another post. Run 'accelerate config' in PowerShell. When it asks which GPU(s) to use, type 'all' (in lowercase, not 'ALL').
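To make the rule reported above concrete, here is a small sketch of a check on the GPU answer: lowercase `all` or a comma-separated list of ids passes, anything else is rejected. The helper `check_gpu_ids` is hypothetical and only mirrors what people describe in this thread; it is not accelerate's own validation code.

```python
def check_gpu_ids(answer):
    """Return a normalized GPU selection, or raise ValueError for answers like 'ALL' or 'a'."""
    answer = answer.strip()
    if answer == "all":  # must be lowercase, per the reports in this thread
        return "all"
    parts = [p.strip() for p in answer.split(",")]
    if all(p.isdigit() for p in parts):
        return ",".join(parts)
    raise ValueError(
        f"unexpected GPU selection {answer!r}: use 'all' or ids such as '0' or '0,1'"
    )


for candidate in ("all", "0", "0, 1", "ALL", "a"):
    try:
        print(repr(candidate), "->", check_gpu_ids(candidate))
    except ValueError as err:
        print(repr(candidate), "-> rejected:", err)
```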
I'm in the same situation with WSL. I've tried everything and nothing works. It's strange, and I don't quite understand why this is happening...
Seems broken. Does anyone know a decent LoRA alternative with a well-maintained repo?
I made it work by adding --gpu_ids=0 at line 724 of lora_gui.py.
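The exact code at that line is not shown in this thread, so the following is only a sketch of the idea: put `--gpu_ids=0` right after `accelerate launch` in the command that gets run. The helper `with_gpu_ids` and the list-of-arguments form are assumptions for illustration, not the actual contents of lora_gui.py; `--gpu_ids` itself is an option of `accelerate launch`.

```python
def with_gpu_ids(cmd, gpu_ids="0"):
    """Return a copy of an 'accelerate launch' argument list with --gpu_ids set."""
    out = list(cmd)
    if any(arg.startswith("--gpu_ids") for arg in out):
        return out  # already set, leave the command alone
    if "launch" not in out:
        raise ValueError("expected an 'accelerate launch ...' command")
    # Launcher options must come before the training script path.
    out.insert(out.index("launch") + 1, f"--gpu_ids={gpu_ids}")
    return out


cmd = ["accelerate", "launch", "./train_network.py", "--xformers"]
print(" ".join(with_gpu_ids(cmd)))
```

The option has to sit before the script name because everything after `./train_network.py` is passed to the training script rather than to accelerate.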
Hello, could you show your code? Thank you!
The error is apparently saying that xformers' memory efficient attention is only available on Nvidia GPUs. I use an AMD GPU, so I just disabled it: under "Advanced" you'll see "CrossAttention", which is set to "xformers" by default. I set that to "none".
For me, it also complained about fp16 or something similar. At the top, under "Accelerate Launch", set "Mixed precision" to "no"; I think that forces full 32-bit precision. I was able to train a model on the CPU last night and it produced a .safetensors file, but it took forever (6 h) and the results were awful, so take that with a grain of salt.
Also, there are a few required fields. You'll need at least Model > Image folder (containing the training image subfolders) and Folders > Output directory for trained model.
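For people launching train_network.py directly rather than through the GUI, the two changes described above roughly correspond to dropping `--xformers` and setting `--mixed_precision=no` in the argument list. A minimal sketch, assuming the arguments look like the ones in the failing command quoted in this thread; the helper name `cpu_safe_args` is hypothetical.

```python
def cpu_safe_args(args):
    """Drop --xformers and force mixed precision off for runs without a CUDA GPU."""
    result = []
    for arg in args:
        if arg == "--xformers":
            continue  # memory efficient attention needs a CUDA GPU
        if arg.startswith("--mixed_precision="):
            arg = "--mixed_precision=no"  # fall back to full 32-bit precision
        result.append(arg)
    return result


original = ["--mixed_precision=bf16", "--seed=1234", "--xformers", "--bucket_no_upscale"]
print(cpu_safe_args(original))
```

This only covers the two settings mentioned in the comment; other options in a given command may also expect a GPU and are left untouched here.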
I tried entering 'all' in lowercase for the GPU selection, but it still does not work:
Traceback (most recent call last):
File "D:\AI\Kohya\kohya_ss\train_network.py", line 1009, in
trainer.train(args)
File "D:\AI\Kohya\kohya_ss\train_network.py", line 232, in train
vae.set_use_memory_efficient_attention_xformers(args.xformers)
File "D:\AI\Kohya\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 251, in set_use_memory_efficient_attention_xformers
fn_recursive_set_mem_eff(module)
File "D:\AI\Kohya\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 247, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "D:\AI\Kohya\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 247, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "D:\AI\Kohya\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 247, in fn_recursive_set_mem_eff
fn_recursive_set_mem_eff(child)
File "D:\AI\Kohya\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 244, in fn_recursive_set_mem_eff
module.set_use_memory_efficient_attention_xformers(valid, attention_op)
File "D:\AI\Kohya\kohya_ss\venv\lib\site-packages\diffusers\models\attention_processor.py", line 203, in set_use_memory_efficient_attention_xformers
raise ValueError(
ValueError: torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is only available for GPU
Traceback (most recent call last):
File "C:\Users\Elouan\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\Elouan\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
exec(code, run_globals)
File "D:\AI\Kohya\kohya_ss\venv\Scripts\accelerate.exe__main__.py", line 7, in
File "D:\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
args.func(args)
File "D:\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 986, in launch_command
simple_launcher(args)
File "D:\AI\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 628, in simple_launcher
raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['D:\AI\Kohya\kohya_ss\venv\Scripts\python.exe', './train_network.py', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=D:/AI/Kohya/kunaboto/kunaboto style/image', '--resolution=512,512', '--output_dir=D:/AI/Kohya/kunaboto/kunaboto style/model', '--logging_dir=D:/AI/Kohya/kunaboto/kunaboto style/log', '--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-05', '--unet_lr=0.0001', '--network_dim=128', '--output_name=Kunaboto style', '--lr_scheduler_num_cycles=1', '--no_half_vae', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=2', '--max_train_steps=1300', '--save_every_n_epochs=1', '--mixed_precision=bf16', '--save_precision=bf16', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--xformers', '--bucket_no_upscale', '--noise_offset=0.0']' returned non-zero exit status 1.