SD3.Flux branch, fresh install: RuntimeError: use_libuv was requested but PyTorch was build without libuv support

Environment

OS: Windows 11
Python version: 3.10
PyTorch version: 2.4.0+cu124
CUDA version: 12.4.0
kohya_ss branch: SD3.Flux
GPU: 2 x RTX 3060, 12 GB VRAM each

Description

I'm trying to train a LoRA for the Flux model using the kohya_ss repository (SD3.Flux branch). When running the training script, I encounter the following error:

RuntimeError: use_libuv was requested but PyTorch was build without libuv support

This error occurs when the accelerate library attempts to launch the script.

Full error log

21:22:24-191231 INFO     Start training LoRA Flux1 ...
21:22:24-192234 INFO     Validating lr scheduler arguments...
21:22:24-194238 INFO     Validating optimizer arguments...
21:22:24-195235 INFO     Validating ./test/logs-saruman existence and writability... SUCCESS
21:22:24-196264 INFO     Validating C:/train/models existence and writability... SUCCESS
21:22:24-198252 INFO     Validating D:/AI/Kohya_ss/kohya_ss/models/flux1-dev.safetensors existence...
                         SUCCESS
21:22:24-199260 INFO     Validating D:/AI/train existence... SUCCESS
21:22:24-201261 INFO     Folder 1_owhx man: 1 repeats found
21:22:24-202262 INFO     Folder 1_owhx man: 20 images found
21:22:24-203234 INFO     Folder 1_owhx man: 20 * 1 = 20 steps
21:22:24-204234 INFO     Regulatization factor: 1
21:22:24-205233 INFO     Total steps: 20
21:22:24-206232 INFO     Train batch size: 1
21:22:24-208233 INFO     Gradient accumulation steps: 1
21:22:24-209261 INFO     Epoch: 1
21:22:24-210234 INFO     max_train_steps (20 / 1 / 1 * 1 * 1) = 20
21:22:24-211234 INFO     stop_text_encoder_training = 0
21:22:24-212234 INFO     lr_warmup_steps = 0
21:22:24-216234 INFO     Saving training config to
                         C:/train/models\Flux.test-v1.0_20240831-212224.json...
21:22:24-218235 INFO     Executing command: D:\AI\Kohya_ss\kohya_ss\venv\Scripts\accelerate.EXE launch
                         --dynamo_backend no --dynamo_mode default --gpu_ids 0,1 --mixed_precision
                         bf16 --multi_gpu --num_processes 2 --num_machines 1
                         --num_cpu_threads_per_process 2
                         D:/AI/Kohya_ss/kohya_ss/sd-scripts/flux_train_network.py --config_file
                         C:/train/models/config_lora-20240831-212224.toml
W0831 21:22:28.249000 6048 torch\distributed\elastic\multiprocessing\redirects.py:28] NOTE: Redirects are currently not supported in Windows or MacOs.
Traceback (most recent call last):
  File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\AI\Kohya_ss\kohya_ss\venv\Scripts\accelerate.EXE\__main__.py", line 7, in <module>
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 48, in main
    args.func(args)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1097, in launch_command
    multi_gpu_launcher(args)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 734, in multi_gpu_launcher
    distrib_run.run(args)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\run.py", line 892, in run
    elastic_launch(
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\launcher\api.py", line 133, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\launcher\api.py", line 255, in launch_agent
    result = agent.run()
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\metrics\api.py", line 124, in wrapper
    result = f(*args, **kwargs)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\agent\server\api.py", line 680, in run
    result = self._invoke_run(role)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\agent\server\api.py", line 829, in _invoke_run
    self._initialize_workers(self._worker_group)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\metrics\api.py", line 124, in wrapper
    result = f(*args, **kwargs)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\agent\server\api.py", line 652, in _initialize_workers
    self._rendezvous(worker_group)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\metrics\api.py", line 124, in wrapper
    result = f(*args, **kwargs)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\agent\server\api.py", line 489, in _rendezvous
    rdzv_info = spec.rdzv_handler.next_rendezvous()
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\rendezvous\static_tcp_rendezvous.py", line 66, in next_rendezvous
    self._store = TCPStore(  # type: ignore[call-arg]
RuntimeError: use_libuv was requested but PyTorch was build without libuv support
21:22:31-620100 INFO     Training has ended.

Steps to reproduce

Clone the kohya_ss repository and check out the SD3.Flux branch
Install dependencies as per the requirements_pytorch_windows.txt file
Run the training script using accelerate
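
For anyone reproducing this outside the GUI, the launch step can be sketched as an argument list. The multi-GPU variant below mirrors the command shown in the log. The single-process variant is an assumption added for comparison: without `--multi_gpu`, accelerate should not go through the `torch.distributed` launcher that appears in the traceback, so it can help confirm that the failure is specific to the multi-GPU path. `build_launch_args` is a hypothetical helper name, and the paths are the ones from the log.

```python
from typing import List

# Paths copied from the log above; adjust to your own installation.
SCRIPT = "D:/AI/Kohya_ss/kohya_ss/sd-scripts/flux_train_network.py"
CONFIG = "C:/train/models/config_lora-20240831-212224.toml"


def build_launch_args(multi_gpu: bool) -> List[str]:
    """Return the accelerate argument list, with or without the multi-GPU path."""
    args = [
        "accelerate", "launch",
        "--dynamo_backend", "no",
        "--dynamo_mode", "default",
        "--mixed_precision", "bf16",
        "--num_machines", "1",
        "--num_cpu_threads_per_process", "2",
    ]
    if multi_gpu:
        # Same flags as the failing command in the log.
        args += ["--gpu_ids", "0,1", "--multi_gpu", "--num_processes", "2"]
    else:
        # Assumed comparison run: one process on the first GPU only.
        args += ["--gpu_ids", "0", "--num_processes", "1"]
    return args + [SCRIPT, "--config_file", CONFIG]


if __name__ == "__main__":
    for flag in (True, False):
        print(" ".join(build_launch_args(flag)))
```

The printed lines can be pasted into the activated venv shell, or the list can be passed to `subprocess.run` from the kohya_ss directory.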

What I've tried

Verified PyTorch installation:

import torch
print(torch.__version__)
print(torch.version.cuda)
print(torch.cuda.is_available())

Output:

2.4.0+cu124
12.4
True

Reinstalled PyTorch and torchvision with CUDA support:

pip install torch==2.4.0+cu124 torchvision==0.19.0+cu124 --index-url https://download.pytorch.org/whl/cu124

Updated accelerate:
```
pip install --upgrade accelerate
```
Ran accelerate config and chose 'NO' for using PyTorch's built-in distributed module
Verified xformers installation:
```
pip install xformers==0.0.27.post2
```

Checked for conflicts with pip list:

(venv) PS D:\AI\Kohya_ss\kohya_ss> pip list
Package                      Version      Editable project location
---------------------------- ------------ ----------------------------------
absl-py                      2.1.0
accelerate                   0.33.0
aiofiles                     23.2.1
aiohappyeyeballs             2.4.0
aiohttp                      3.10.5
aiosignal                    1.3.1
altair                       4.2.2
annotated-types              0.7.0
antlr4-python3-runtime       4.9.3
anyio                        4.4.0
appdirs                      1.4.4
astunparse                   1.6.3
async-timeout                4.0.3
attrs                        24.2.0
bitsandbytes                 0.43.3
certifi                      2022.12.7
charset-normalizer           2.1.1
click                        8.1.7
colorama                     0.4.6
coloredlogs                  15.0.1
contourpy                    1.3.0
cycler                       0.12.1
dadaptation                  3.2
diffusers                    0.25.0
docker-pycreds               0.4.0
easygui                      0.98.3
einops                       0.7.0
entrypoints                  0.4
exceptiongroup               1.2.2
fairscale                    0.4.13
fastapi                      0.112.2
ffmpy                        0.4.0
filelock                     3.13.1
flatbuffers                  24.3.25
fonttools                    4.53.1
frozenlist                   1.4.1
fsspec                       2024.2.0
ftfy                         6.1.1
gast                         0.6.0
gitdb                        4.0.11
GitPython                    3.1.43
google-pasta                 0.2.0
gradio                       4.41.0
gradio_client                1.3.0
grpcio                       1.66.1
h11                          0.14.0
h5py                         3.11.0
httpcore                     1.0.5
httpx                        0.27.2
huggingface-hub              0.24.5
humanfriendly                10.0
idna                         3.4
imagesize                    1.4.1
importlib_metadata           8.4.0
importlib_resources          6.4.4
invisible-watermark          0.2.0
Jinja2                       3.1.3
jsonschema                   4.23.0
jsonschema-specifications    2023.12.1
keras                        3.5.0
kiwisolver                   1.4.5
libclang                     18.1.1
library                      0.0.0        d:\ai\kohya_ss\kohya_ss\sd-scripts
lightning-utilities          0.11.6
lion-pytorch                 0.0.6
lycoris-lora                 2.2.0.post3
Markdown                     3.7
markdown-it-py               3.0.0
MarkupSafe                   2.1.5
matplotlib                   3.9.2
mdurl                        0.1.2
ml-dtypes                    0.4.0
mpmath                       1.3.0
multidict                    6.0.5
namex                        0.0.8
networkx                     3.2.1
numpy                        1.26.3
nvidia-cublas-cu11           11.11.3.6
nvidia-cuda-nvrtc-cu11       11.8.89
nvidia-cudnn-cu11            8.9.5.29
omegaconf                    2.3.0
onnx                         1.16.1
onnxruntime-gpu              1.17.1
open-clip-torch              2.20.0
opencv-python                4.7.0.72
opt-einsum                   3.3.0
optree                       0.12.1
orjson                       3.10.7
packaging                    24.1
pandas                       2.2.2
pathtools                    0.1.2
pillow                       10.2.0
pip                          24.2
prodigyopt                   1.0
protobuf                     3.20.3
psutil                       6.0.0
pydantic                     2.8.2
pydantic_core                2.20.1
pydub                        0.25.1
Pygments                     2.18.0
pyparsing                    3.1.4
pyreadline3                  3.4.1
python-dateutil              2.9.0.post0
python-multipart             0.0.9
pytorch-lightning            1.9.0
pytz                         2024.1
PyWavelets                   1.7.0
PyYAML                       6.0.2
referencing                  0.35.1
regex                        2024.7.24
requests                     2.32.3
rich                         13.8.0
rpds-py                      0.20.0
ruff                         0.6.3
safetensors                  0.4.4
scipy                        1.11.4
semantic-version             2.10.0
sentencepiece                0.2.0
sentry-sdk                   2.13.0
setproctitle                 1.3.3
setuptools                   65.5.0
shellingham                  1.5.4
six                          1.16.0
smmap                        5.0.1
sniffio                      1.3.1
starlette                    0.38.2
sympy                        1.12
tensorboard                  2.17.1
tensorboard-data-server      0.7.2
tensorflow                   2.17.0
tensorflow-intel             2.17.0
tensorflow-io-gcs-filesystem 0.31.0
termcolor                    2.4.0
timm                         0.6.12
tk                           0.1.0
tokenizers                   0.19.1
toml                         0.10.2
tomlkit                      0.12.0
toolz                        0.12.1
torch                        2.4.0+cu124
torchaudio                   2.4.0
torchmetrics                 1.4.1
torchvision                  0.19.0+cu124
tqdm                         4.66.5
transformers                 4.44.0
typer                        0.12.5
typing_extensions            4.9.0
tzdata                       2024.1
urllib3                      2.2.2
uvicorn                      0.30.6
voluptuous                   0.13.1
wandb                        0.15.11
wcwidth                      0.2.13
websockets                   11.0.3
Werkzeug                     3.0.4
wheel                        0.44.0
wrapt                        1.16.0
xformers                     0.0.27.post2
yarl                         1.9.6
zipp                         3.20.1
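
Scanning a list this long by eye is error-prone, so the check can be partly automated. The sketch below parses `pip list` output and flags packages whose CUDA tag (a `+cuNNN` version suffix or a `-cuNN` name suffix) has a different major version than the one on `torch`. `parse_pip_list` and `cuda_tag_mismatches` are hypothetical helper names and the tag heuristic is an assumption. A flagged package is not necessarily broken and is not known to be related to the libuv error; it is just something to look at.

```python
import re
from typing import Dict, List


def parse_pip_list(text: str) -> Dict[str, str]:
    """Map lower-cased package names to versions, skipping header lines."""
    pkgs: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split()
        # Real rows have a version that starts with a digit.
        if len(parts) >= 2 and parts[1][0].isdigit():
            pkgs[parts[0].lower()] = parts[1]
    return pkgs


def cuda_tag_mismatches(pkgs: Dict[str, str]) -> List[str]:
    """Return packages whose CUDA major version differs from torch's."""
    torch_tag = re.search(r"\+cu(\d+)", pkgs.get("torch", ""))
    if not torch_tag:
        return []
    torch_major = torch_tag.group(1)[:2]  # "124" -> "12"
    flagged = []
    for name, version in pkgs.items():
        tag = re.search(r"\+cu(\d+)$", version) or re.search(r"-cu(\d+)$", name)
        if tag and tag.group(1)[:2] != torch_major:
            flagged.append(name)
    return sorted(flagged)


if __name__ == "__main__":
    # A few rows copied from the listing above.
    sample = """\
Package                      Version      Editable project location
---------------------------- ------------ ----------------------------------
nvidia-cublas-cu11           11.11.3.6
nvidia-cuda-nvrtc-cu11       11.8.89
nvidia-cudnn-cu11            8.9.5.29
torch                        2.4.0+cu124
torchaudio                   2.4.0
torchvision                  0.19.0+cu124
"""
    print(cuda_tag_mismatches(parse_pip_list(sample)))
```

On the rows above this flags the three `nvidia-*-cu11` packages, which sit alongside a `cu124` build of torch in this environment.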

And here is a dump of my training parameters:

ae = "D:/AI/Kohya_ss/kohya_ss/models/ae.safetensors"
apply_t5_attn_mask = true
bucket_no_upscale = true
bucket_reso_steps = 64
cache_latents = true
cache_latents_to_disk = true
cache_text_encoder_outputs = true
cache_text_encoder_outputs_to_disk = true
caption_extension = ".txt"
clip_l = "D:/AI/Kohya_ss/kohya_ss/models/clip_l.safetensors"
clip_skip = 1
discrete_flow_shift = 3.0
dynamo_backend = "no"
enable_bucket = true
epoch = 1
fp8_base = true
gradient_accumulation_steps = 1
gradient_checkpointing = true
guidance_scale = 1.0
huber_c = 0.1
huber_schedule = "snr"
logging_dir = "./test/logs-saruman"
loss_type = "l2"
lr_scheduler = "constant"
lr_scheduler_args = []
lr_scheduler_num_cycles = 1
lr_scheduler_power = 1
max_bucket_reso = 512
max_data_loader_n_workers = 0
max_grad_norm = 1
max_timestep = 1000
max_train_epochs = 10
max_train_steps = 20
mem_eff_attn = true
min_bucket_reso = 256
min_snr_gamma = 7
mixed_precision = "fp8"
model_prediction_type = "raw"
network_alpha = 16
network_args = [ "train_blocks=single",]
network_dim = 16
network_module = "networks.lora_flux"
network_train_unet_only = true
noise_offset = 0.05
noise_offset_type = "Original"
optimizer_args = [ "relative_step=False", "scale_parameter=False",
"warmup_init=False",]
optimizer_type = "Adafactor"
output_dir = "C:/train/models"
output_name = "Flux.test-v1.0"
pretrained_model_name_or_path = "D:/AI/Kohya_ss/kohya_ss/models/flux1-dev.safetensors"
prior_loss_weight = 1
resolution = "512,512"
sample_every_n_epochs = 1
sample_prompts = "C:/train/models\\sample/prompt.txt"
sample_sampler = "euler"
save_every_n_epochs = 1
save_every_n_steps = 50
save_model_as = "safetensors"
save_precision = "fp16"
sdpa = true
seed = 42
split_mode = true
t5xxl = "D:/AI/Kohya_ss/kohya_ss/models/t5xxl_fp16.safetensors"
t5xxl_max_token_length = 512
timestep_sampling = "sigmoid"
train_batch_size = 1
train_data_dir = "D:/AI/train"
unet_lr = 0.0003
wandb_run_name = "Flux.test-v1.0"

Any help in resolving this issue would really be appreciated. Let me know if you need any more information.

bmaltais/kohya_ss issue #2763