bmaltais / kohya_ss

Apache License 2.0
Does it work on Intel GPUs? #615

Closed: Jaeha47 closed this issue 8 months ago

Jaeha47 commented 1 year ago

It just gets stuck at " 0%| | 0/15 [00:00<?, ?it/s] ". Is it because of my Intel Arc? Or did I do something wrong?

Edit: OK, now it gave this error:

 raise ValueError(err.format(mode="fp16", requirement="a GPU"))
ValueError: fp16 mixed precision requires a GPU

So I guess it doesn't work with Intel GPUs.

Edit: After setting Mixed Precision to "no" and Save Precision to "float" in the training parameter settings, the first step worked, but then it gives me a bunch of errors during optimization:

E:\Arc Diffusion\Kohya\kohya_ss\venv\lib\site-packages\torch\utils\checkpoint.py:25: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
  warnings.warn("None of the inputs have requires_grad=True. Gradients will be None")
E:\Arc Diffusion\Kohya\kohya_ss\venv\lib\site-packages\torch\amp\autocast_mode.py:198: UserWarning: User provided device_type of 'cuda', but CUDA is not available. Disabling
  warnings.warn('User provided device_type of \'cuda\', but CUDA is not available. Disabling')
Error CUDA driver version is insufficient for CUDA runtime version at line 167 in file D:\ai\tool\bitsandbytes\csrc\ops.cu
Traceback (most recent call last):
  File "C:\Users\denni\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\denni\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "E:\Arc Diffusion\Kohya\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "E:\Arc Diffusion\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 45, in main
    args.func(args)
  File "E:\Arc Diffusion\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1104, in launch_command
    simple_launcher(args)
  File "E:\Arc Diffusion\Kohya\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 567, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['E:\\Arc Diffusion\\Kohya\\kohya_ss\\venv\\Scripts\\python.exe', 'train_network.py', '--pretrained_model_name_or_path=runwayml/stable-diffusion-v1-5', '--train_data_dir=C:/Users/PC/Desktop/Stamp/CharLORA/image', '--resolution=512,512', '--output_dir=C:/Users/PC/Desktop/Stamp/CharLORA/model', '--logging_dir=C:/Users/PC/Desktop/Stamp/CharLORA/log', '--network_alpha=128', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=5e-5', '--unet_lr=0.0001', '--network_dim=128', '--output_name=Feste', '--lr_scheduler_num_cycles=1', '--learning_rate=0.0001', '--lr_scheduler=constant', '--train_batch_size=1', '--max_train_steps=1500', '--save_every_n_epochs=1', '--mixed_precision=no', '--save_precision=float', '--seed=1234', '--caption_extension=.txt', '--cache_latents', '--optimizer_type=AdamW8bit', '--max_data_loader_n_workers=1', '--clip_skip=2', '--bucket_reso_steps=64', '--mem_eff_attn', '--gradient_checkpointing', '--xformers', '--bucket_no_upscale']' returned non-zer
bmaltais commented 1 year ago

This solution is only supported on NVIDIA GPUs and (maybe) macOS GPUs... but I am not 100% sure it works properly on macOS Mx CPU/GPU.
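
For anyone debugging a similar setup: the errors in the log above come from code paths that expect a CUDA device. A minimal sketch for checking which backend PyTorch can actually see is below. The function name `detect_backend` is hypothetical and not part of kohya_ss, and the sketch assumes an Intel-enabled install exposes `torch.xpu` (possibly only after importing `intel_extension_for_pytorch`).

```python
def detect_backend():
    """Return the name of the first accelerator backend PyTorch reports.

    Hypothetical helper for illustration; not part of kohya_ss.
    """
    try:
        import torch
    except ImportError:
        return "none"  # PyTorch is not installed in this environment

    # Assumption: on IPEX-based installs, torch.xpu may only be registered
    # once intel_extension_for_pytorch has been imported.
    try:
        import intel_extension_for_pytorch  # noqa: F401
    except Exception:
        pass

    if torch.cuda.is_available():
        return "cuda"

    xpu = getattr(torch, "xpu", None)
    if xpu is not None and hasattr(xpu, "is_available") and xpu.is_available():
        return "xpu"

    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print(detect_backend())
```

If this prints `cpu` or `xpu`, any option that hard-requires CUDA (such as the bitsandbytes 8-bit optimizer seen failing in the log above) can be expected to fail.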

nonetrix commented 1 year ago

I'd really like to see Intel Arc support, since those cards offer far more VRAM than the competition at a lower price, despite the weaker performance.

Disty0 commented 12 months ago

Intel ARC is working now: https://www.technopat.net/sosyal/konu/installing-kohya-ss-with-intel-arc-gpus.2869152/

ep150de commented 9 months ago

I'm still running into issues getting kohya training to run on an Intel Arc A770 GPU, after using the --use-ipex flag for setup and for running the GUI.

Error traceback:

Traceback (most recent call last):
  File "C:\Users\Demo\kohya_ss\sdxl_train_network.py", line 189, in <module>
    trainer.train(args)
  File "C:\Users\Demo\kohya_ss\train_network.py", line 242, in train
    vae.set_use_memory_efficient_attention_xformers(args.xformers)
  File "C:\Users\Demo\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 263, in set_use_memory_efficient_attention_xformers
    fn_recursive_set_mem_eff(module)
  File "C:\Users\Demo\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 259, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\Demo\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 259, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\Demo\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 259, in fn_recursive_set_mem_eff
    fn_recursive_set_mem_eff(child)
  File "C:\Users\Demo\kohya_ss\venv\lib\site-packages\diffusers\models\modeling_utils.py", line 256, in fn_recursive_set_mem_eff
    module.set_use_memory_efficient_attention_xformers(valid, attention_op)
  File "C:\Users\Demo\kohya_ss\venv\lib\site-packages\diffusers\models\attention_processor.py", line 255, in set_use_memory_efficient_attention_xformers
    raise ValueError(
ValueError: torch.cuda.is_available() should be True but is False. xformers' memory efficient attention is only available for GPU
Traceback (most recent call last):
  File "C:\Users\Demo\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Demo\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\Demo\kohya_ss\venv\Scripts\accelerate.exe\__main__.py", line 7, in <module>
  File "C:\Users\Demo\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 47, in main
    args.func(args)
  File "C:\Users\Demo\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 986, in launch_command
    simple_launcher(args)
  File "C:\Users\Demo\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 628, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['C:\Users\Demo\kohya_ss\venv\Scripts\python.exe', './sdxl_train_network.py', '--enable_bucket', '--min_bucket_reso=256', '--max_bucket_reso=2048', '--pretrained_model_name_or_path=stabilityai/stable-diffusion-xl-base-1.0', '--train_data_dir=C:/Users/Demo/Desktop/pat/round2 training\img', '--resolution=1024,1024', '--output_dir=C:/Users/Demo/Desktop/pat/round2 training\model', '--logging_dir=C:/Users/Demo/Desktop/pat/round2 training\log', '--network_alpha=1', '--save_model_as=safetensors', '--network_module=networks.lora', '--text_encoder_lr=0.0004', '--unet_lr=0.0004', '--network_dim=256', '--output_name=patgv7_arc', '--lr_scheduler_num_cycles=4', '--no_half_vae', '--learning_rate=0.0004', '--lr_scheduler=constant', '--train_batch_size=2', '--max_train_steps=1680', '--save_every_n_epochs=1', '--mixed_precision=bf16', '--save_precision=bf16', '--cache_latents', '--cache_latents_to_disk', '--optimizer_type=Adafactor', '--optimizer_args', 'scale_parameter=False', 'relative_step=False', 'warmup_init=False', '--max_grad_norm=1', '--max_data_loader_n_workers=0', '--bucket_reso_steps=64', '--save_state', '--gradient_checkpointing', '--xformers', '--bucket_no_upscale', '--noise_offset=0.0']' returned non-zero exit status 1.

Disty0 commented 9 months ago

The guide I sent tells you to use SDPA. Follow the guide.
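
To illustrate the point: the failing command still passes `--xformers`, which is what raises the `torch.cuda.is_available()` error. A minimal sketch of swapping CUDA-dependent options out of an argument list follows. The `--sdpa` flag name and the plain `AdamW` optimizer value are assumptions about what the training scripts accept (the thread itself only says "use SDPA"), and `adapt_args_for_non_cuda` is a hypothetical helper, not part of kohya_ss.

```python
# Hypothetical mapping from options that failed in this thread's logs
# to assumed non-CUDA alternatives. Verify the flag names against your
# installed version of the training scripts before relying on them.
NON_CUDA_REPLACEMENTS = {
    "--xformers": "--sdpa",  # assumption: SDPA attention flag
    "--optimizer_type=AdamW8bit": "--optimizer_type=AdamW",  # avoids bitsandbytes
}


def adapt_args_for_non_cuda(args):
    """Return a copy of args with known CUDA-dependent options replaced."""
    return [NON_CUDA_REPLACEMENTS.get(arg, arg) for arg in args]


if __name__ == "__main__":
    original = ["--gradient_checkpointing", "--xformers", "--bucket_no_upscale"]
    print(adapt_args_for_non_cuda(original))
```

In the GUI, the equivalent change would be selecting SDPA instead of xformers for cross-attention, as the linked guide describes.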