dicksondickson / lora-ease-wsl

Train LoRA with Ease

Is it compatible with Win10 WSL? #1

Open cgiles opened 3 months ago

cgiles commented 3 months ago

I managed to install it on Win10, with Debian, in a venv, but it fails when it tries to train a LoRA.

Is it because I use Win10?

My log:

```
Creating dataset
07/15/2024 14:26:06 - INFO - train_dreambooth_lora_sdxl_advanced - Distributed environment: DistributedType.NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cuda

Mixed precision type: bf16

You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
You are using a model of type clip_text_model to instantiate a model of type . This is not supported for all configurations of models and can yield errors.
{'thresholding', 'variance_type', 'rescale_betas_zero_snr', 'clip_sample_range', 'dynamic_thresholding_ratio'} was not found in config. Values will be initialized to default values.
{'shift_factor', 'latents_std', 'use_quant_conv', 'latents_mean', 'use_post_quant_conv'} was not found in config. Values will be initialized to default values.
{'dropout', 'reverse_transformer_layers_per_block', 'attention_type'} was not found in config. Values will be initialized to default values.
07/15/2024 14:26:09 - INFO - train_dreambooth_lora_sdxl_advanced - list of token identifiers: ['TOK']
0 text encodedr's std_token_embedding: 0.015339338220655918 torch.Size([49410])
1 text encodedr's std_token_embedding: 0.014393440447747707 torch.Size([49410])
Traceback (most recent call last):
  File "/home/my_name/train/venv/lib/python3.11/site-packages/gradio/routes.py", line 739, in predict
    output = await route_utils.call_process_api(
  File "/home/my_name/train/venv/lib/python3.11/site-packages/gradio/route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/my_name/train/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1897, in process_api
    result = await self.call_function(
  File "/home/my_name/train/venv/lib/python3.11/site-packages/gradio/blocks.py", line 1483, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/my_name/train/venv/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/home/my_name/train/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/home/my_name/train/venv/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/home/my_name/train/venv/lib/python3.11/site-packages/gradio/utils.py", line 816, in wrapper
    response = f(*args, **kwargs)
  File "/home/my_name/train/lora-ease-wsl/app.py", line 532, in start_training_og
    train_main(args)
  File "/home/my_name/train/lora-ease-wsl/train_dreambooth_lora_sdxl_advanced.py", line 1227, in main
    unet.to(accelerator.device, dtype=weight_dtype)
  File "/home/my_name/train/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1173, in to
    return self._apply(convert)
  File "/home/my_name/train/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/home/my_name/train/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  File "/home/my_name/train/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 779, in _apply
    module._apply(fn)
  [Previous line repeated 6 more times]
  File "/home/my_name/train/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 804, in _apply
    param_applied = fn(param)
  File "/home/my_name/train/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1159, in convert
    return t.to(
RuntimeError: CUDA error: unknown error
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
```
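The failure happens on the very first `.to(cuda)` call, before any training work, which usually means the CUDA runtime itself is unreachable from the venv. A minimal, hypothetical diagnostic (not part of lora-ease-wsl; it assumes PyTorch is installed in the same venv) to run before launching training:

```python
# Hypothetical CUDA sanity check for the WSL venv. It only reports
# whether PyTorch can see the GPU at all; it does not fix anything.
def cuda_diagnostic():
    """Return a small dict describing CUDA visibility from this interpreter."""
    try:
        import torch
    except ImportError:
        return {"available": False, "reason": "torch not installed"}
    if not torch.cuda.is_available():
        # Under WSL this typically means the Windows NVIDIA driver with
        # WSL support is missing, or the distro is running as WSL1.
        return {"available": False, "reason": "torch.cuda.is_available() is False"}
    return {
        "available": True,
        "device_count": torch.cuda.device_count(),
        "device_name": torch.cuda.get_device_name(0),
    }

if __name__ == "__main__":
    print(cuda_diagnostic())
```

If this reports `available: False`, the training script will fail exactly as in the traceback above, and the problem is the WSL/driver setup rather than the script.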

dicksondickson commented 3 months ago

Can you try WSL with Ubuntu? I don't have Win10 and can't test, but Win10 uses WSL1, which has problems. Win11 uses WSL2, which is much faster and more robust.
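One way to tell which WSL generation a distro is running under (besides `wsl.exe -l -v` from PowerShell) is the Linux kernel string: WSL2 kernels report something like `-microsoft-standard-WSL2`, while WSL1 exposes a `-Microsoft` kernel. A small hypothetical helper that classifies the contents of `/proc/version`:

```python
# Hypothetical helper: guess the WSL generation from a kernel version
# string. The substrings checked here are the conventional markers
# WSL1/WSL2 kernels report; plain Linux has neither.
def wsl_generation(version_string):
    s = version_string.lower()
    if "wsl2" in s:
        return 2
    if "microsoft" in s:
        return 1
    return None  # not running under WSL

# Usage on a live system:
#   with open("/proc/version") as f:
#       print(wsl_generation(f.read()))
```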

dicksondickson commented 3 months ago

Can you run `nvidia-smi` in the Linux instance and post the output?
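If you'd rather capture that output from a script, a hypothetical wrapper (not part of this repo) that runs `nvidia-smi` and degrades gracefully when WSL cannot see the driver:

```python
import subprocess

# Hypothetical helper: run nvidia-smi and return its text output, or
# None when the binary is absent (e.g. the Windows driver is not
# exposed to this WSL instance).
def query_nvidia_smi():
    try:
        result = subprocess.run(
            ["nvidia-smi"], capture_output=True, text=True, timeout=30
        )
    except FileNotFoundError:
        return None  # nvidia-smi not on PATH: WSL cannot see the driver
    return result.stdout if result.returncode == 0 else result.stderr
```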

cgiles commented 3 months ago

Here is the `nvidia-smi` output. I will install a clean Ubuntu tomorrow and report if it is different.

```
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 555.58.02              Driver Version: 556.12          CUDA Version: 12.5    |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        On  |   00000000:01:00.0  On |                  N/A |
|  0%   53C    P8             19W / 170W  |     929MiB /  12288MiB |      22%     Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
```

cgiles commented 3 months ago

So I made a clean WSL2 install of Ubuntu, and it worked. My computer is just too weak to make it happen 😔. But it can work on Win10.

dicksondickson commented 3 months ago

Glad to hear it works. Yes, LoRA training is compute intensive. I tried training on my 2080 Ti and a simple LoRA took 12 hours.
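For readers budgeting a run on a mid-range GPU, wall time scales roughly linearly with step count. A back-of-the-envelope sketch (the seconds-per-step figure is an illustrative assumption, not a measurement from this repo):

```python
# Rough estimate of LoRA training wall time: steps × time per step.
# seconds_per_step varies widely with GPU, resolution, and batch size.
def estimated_hours(total_steps, seconds_per_step):
    return total_steps * seconds_per_step / 3600

# e.g. a hypothetical 1500-step run at 4 s/step:
#   estimated_hours(1500, 4.0)  # ~1.67 hours
```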