bmaltais / kohya_ss

SD3.Flux branch, fresh install: RuntimeError: use_libuv was requested but PyTorch was build without libuv support #2763

Closed by TinyForge 2 months ago

TinyForge commented 2 months ago

Environment

Description

I'm trying to train a LoRA for the Flux model using the kohya_ss repository (flux branch). When running the training script, I encounter the following error:

RuntimeError: use_libuv was requested but PyTorch was build without libuv support

This error occurs when the accelerate library attempts to launch the script.

Full error log

21:22:24-191231 INFO     Start training LoRA Flux1 ...
21:22:24-192234 INFO     Validating lr scheduler arguments...
21:22:24-194238 INFO     Validating optimizer arguments...
21:22:24-195235 INFO     Validating ./test/logs-saruman existence and writability... SUCCESS
21:22:24-196264 INFO     Validating C:/train/models existence and writability... SUCCESS
21:22:24-198252 INFO     Validating D:/AI/Kohya_ss/kohya_ss/models/flux1-dev.safetensors existence...
                         SUCCESS
21:22:24-199260 INFO     Validating D:/AI/train existence... SUCCESS
21:22:24-201261 INFO     Folder 1_owhx man: 1 repeats found
21:22:24-202262 INFO     Folder 1_owhx man: 20 images found
21:22:24-203234 INFO     Folder 1_owhx man: 20 * 1 = 20 steps
21:22:24-204234 INFO     Regulatization factor: 1
21:22:24-205233 INFO     Total steps: 20
21:22:24-206232 INFO     Train batch size: 1
21:22:24-208233 INFO     Gradient accumulation steps: 1
21:22:24-209261 INFO     Epoch: 1
21:22:24-210234 INFO     max_train_steps (20 / 1 / 1 * 1 * 1) = 20
21:22:24-211234 INFO     stop_text_encoder_training = 0
21:22:24-212234 INFO     lr_warmup_steps = 0
21:22:24-216234 INFO     Saving training config to
                         C:/train/models\Flux.test-v1.0_20240831-212224.json...
21:22:24-218235 INFO     Executing command: D:\AI\Kohya_ss\kohya_ss\venv\Scripts\accelerate.EXE launch
                         --dynamo_backend no --dynamo_mode default --gpu_ids 0,1 --mixed_precision
                         bf16 --multi_gpu --num_processes 2 --num_machines 1
                         --num_cpu_threads_per_process 2
                         D:/AI/Kohya_ss/kohya_ss/sd-scripts/flux_train_network.py --config_file
                         C:/train/models/config_lora-20240831-212224.toml
W0831 21:22:28.249000 6048 torch\distributed\elastic\multiprocessing\redirects.py:28] NOTE: Redirects are currently not supported in Windows or MacOs.
Traceback (most recent call last):
  File "C:\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "D:\AI\Kohya_ss\kohya_ss\venv\Scripts\accelerate.EXE\__main__.py", line 7, in <module>
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\accelerate\commands\accelerate_cli.py", line 48, in main
    args.func(args)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 1097, in launch_command
    multi_gpu_launcher(args)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\accelerate\commands\launch.py", line 734, in multi_gpu_launcher
    distrib_run.run(args)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\run.py", line 892, in run
    elastic_launch(
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\launcher\api.py", line 133, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\launcher\api.py", line 255, in launch_agent
    result = agent.run()
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\metrics\api.py", line 124, in wrapper
    result = f(*args, **kwargs)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\agent\server\api.py", line 680, in run
    result = self._invoke_run(role)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\agent\server\api.py", line 829, in _invoke_run
    self._initialize_workers(self._worker_group)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\metrics\api.py", line 124, in wrapper
    result = f(*args, **kwargs)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\agent\server\api.py", line 652, in _initialize_workers
    self._rendezvous(worker_group)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\metrics\api.py", line 124, in wrapper
    result = f(*args, **kwargs)
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\agent\server\api.py", line 489, in _rendezvous
    rdzv_info = spec.rdzv_handler.next_rendezvous()
  File "D:\AI\Kohya_ss\kohya_ss\venv\lib\site-packages\torch\distributed\elastic\rendezvous\static_tcp_rendezvous.py", line 66, in next_rendezvous
    self._store = TCPStore(  # type: ignore[call-arg]
RuntimeError: use_libuv was requested but PyTorch was build without libuv support
21:22:31-620100 INFO     Training has ended.

Steps to reproduce

  1. Clone the kohya_ss repository and check out the flux branch
  2. Install dependencies as per the requirements_pytorch_windows.txt file
  3. Run the training script using accelerate

What I've tried

  1. Verified PyTorch installation:

    import torch
    print(torch.__version__)
    print(torch.version.cuda)
    print(torch.cuda.is_available())

    Output:

    2.4.0+cu124
    12.4
    True
  2. Reinstalled PyTorch and torchvision with CUDA support:

    pip install torch==2.4.0+cu124 torchvision==0.19.0+cu124 --index-url https://download.pytorch.org/whl/cu124
  3. Updated accelerate:

    pip install --upgrade accelerate
  4. Ran accelerate config and chose 'NO' for using PyTorch's built-in distributed module

  5. Verified xformers installation:

    pip install xformers==0.0.27.post2
  6. Checked for conflicts with pip list

    (venv) PS D:\AI\Kohya_ss\kohya_ss> pip list
    Package                      Version      Editable project location
    ---------------------------- ------------ ----------------------------------
    absl-py                      2.1.0
    accelerate                   0.33.0
    aiofiles                     23.2.1
    aiohappyeyeballs             2.4.0
    aiohttp                      3.10.5
    aiosignal                    1.3.1
    altair                       4.2.2
    annotated-types              0.7.0
    antlr4-python3-runtime       4.9.3
    anyio                        4.4.0
    appdirs                      1.4.4
    astunparse                   1.6.3
    async-timeout                4.0.3
    attrs                        24.2.0
    bitsandbytes                 0.43.3
    certifi                      2022.12.7
    charset-normalizer           2.1.1
    click                        8.1.7
    colorama                     0.4.6
    coloredlogs                  15.0.1
    contourpy                    1.3.0
    cycler                       0.12.1
    dadaptation                  3.2
    diffusers                    0.25.0
    docker-pycreds               0.4.0
    easygui                      0.98.3
    einops                       0.7.0
    entrypoints                  0.4
    exceptiongroup               1.2.2
    fairscale                    0.4.13
    fastapi                      0.112.2
    ffmpy                        0.4.0
    filelock                     3.13.1
    flatbuffers                  24.3.25
    fonttools                    4.53.1
    frozenlist                   1.4.1
    fsspec                       2024.2.0
    ftfy                         6.1.1
    gast                         0.6.0
    gitdb                        4.0.11
    GitPython                    3.1.43
    google-pasta                 0.2.0
    gradio                       4.41.0
    gradio_client                1.3.0
    grpcio                       1.66.1
    h11                          0.14.0
    h5py                         3.11.0
    httpcore                     1.0.5
    httpx                        0.27.2
    huggingface-hub              0.24.5
    humanfriendly                10.0
    idna                         3.4
    imagesize                    1.4.1
    importlib_metadata           8.4.0
    importlib_resources          6.4.4
    invisible-watermark          0.2.0
    Jinja2                       3.1.3
    jsonschema                   4.23.0
    jsonschema-specifications    2023.12.1
    keras                        3.5.0
    kiwisolver                   1.4.5
    libclang                     18.1.1
    library                      0.0.0        d:\ai\kohya_ss\kohya_ss\sd-scripts
    lightning-utilities          0.11.6
    lion-pytorch                 0.0.6
    lycoris-lora                 2.2.0.post3
    Markdown                     3.7
    markdown-it-py               3.0.0
    MarkupSafe                   2.1.5
    matplotlib                   3.9.2
    mdurl                        0.1.2
    ml-dtypes                    0.4.0
    mpmath                       1.3.0
    multidict                    6.0.5
    namex                        0.0.8
    networkx                     3.2.1
    numpy                        1.26.3
    nvidia-cublas-cu11           11.11.3.6
    nvidia-cuda-nvrtc-cu11       11.8.89
    nvidia-cudnn-cu11            8.9.5.29
    omegaconf                    2.3.0
    onnx                         1.16.1
    onnxruntime-gpu              1.17.1
    open-clip-torch              2.20.0
    opencv-python                4.7.0.72
    opt-einsum                   3.3.0
    optree                       0.12.1
    orjson                       3.10.7
    packaging                    24.1
    pandas                       2.2.2
    pathtools                    0.1.2
    pillow                       10.2.0
    pip                          24.2
    prodigyopt                   1.0
    protobuf                     3.20.3
    psutil                       6.0.0
    pydantic                     2.8.2
    pydantic_core                2.20.1
    pydub                        0.25.1
    Pygments                     2.18.0
    pyparsing                    3.1.4
    pyreadline3                  3.4.1
    python-dateutil              2.9.0.post0
    python-multipart             0.0.9
    pytorch-lightning            1.9.0
    pytz                         2024.1
    PyWavelets                   1.7.0
    PyYAML                       6.0.2
    referencing                  0.35.1
    regex                        2024.7.24
    requests                     2.32.3
    rich                         13.8.0
    rpds-py                      0.20.0
    ruff                         0.6.3
    safetensors                  0.4.4
    scipy                        1.11.4
    semantic-version             2.10.0
    sentencepiece                0.2.0
    sentry-sdk                   2.13.0
    setproctitle                 1.3.3
    setuptools                   65.5.0
    shellingham                  1.5.4
    six                          1.16.0
    smmap                        5.0.1
    sniffio                      1.3.1
    starlette                    0.38.2
    sympy                        1.12
    tensorboard                  2.17.1
    tensorboard-data-server      0.7.2
    tensorflow                   2.17.0
    tensorflow-intel             2.17.0
    tensorflow-io-gcs-filesystem 0.31.0
    termcolor                    2.4.0
    timm                         0.6.12
    tk                           0.1.0
    tokenizers                   0.19.1
    toml                         0.10.2
    tomlkit                      0.12.0
    toolz                        0.12.1
    torch                        2.4.0+cu124
    torchaudio                   2.4.0
    torchmetrics                 1.4.1
    torchvision                  0.19.0+cu124
    tqdm                         4.66.5
    transformers                 4.44.0
    typer                        0.12.5
    typing_extensions            4.9.0
    tzdata                       2024.1
    urllib3                      2.2.2
    uvicorn                      0.30.6
    voluptuous                   0.13.1
    wandb                        0.15.11
    wcwidth                      0.2.13
    websockets                   11.0.3
    Werkzeug                     3.0.4
    wheel                        0.44.0
    wrapt                        1.16.0
    xformers                     0.0.27.post2
    yarl                         1.9.6
    zipp                         3.20.1

And here is a dump of my training parameters:

ae = "D:/AI/Kohya_ss/kohya_ss/models/ae.safetensors"
apply_t5_attn_mask = true
bucket_no_upscale = true
bucket_reso_steps = 64
cache_latents = true
cache_latents_to_disk = true
cache_text_encoder_outputs = true
cache_text_encoder_outputs_to_disk = true
caption_extension = ".txt"
clip_l = "D:/AI/Kohya_ss/kohya_ss/models/clip_l.safetensors"
clip_skip = 1
discrete_flow_shift = 3.0
dynamo_backend = "no"
enable_bucket = true
epoch = 1
fp8_base = true
gradient_accumulation_steps = 1
gradient_checkpointing = true
guidance_scale = 1.0
huber_c = 0.1
huber_schedule = "snr"
logging_dir = "./test/logs-saruman"
loss_type = "l2"
lr_scheduler = "constant"
lr_scheduler_args = []
lr_scheduler_num_cycles = 1
lr_scheduler_power = 1
max_bucket_reso = 512
max_data_loader_n_workers = 0
max_grad_norm = 1
max_timestep = 1000
max_train_epochs = 10
max_train_steps = 20
mem_eff_attn = true
min_bucket_reso = 256
min_snr_gamma = 7
mixed_precision = "fp8"
model_prediction_type = "raw"
network_alpha = 16
network_args = [ "train_blocks=single",]
network_dim = 16
network_module = "networks.lora_flux"
network_train_unet_only = true
noise_offset = 0.05
noise_offset_type = "Original"
optimizer_args = [ "relative_step=False", "scale_parameter=False", "warmup_init=False",]
optimizer_type = "Adafactor"
output_dir = "C:/train/models"
output_name = "Flux.test-v1.0"
pretrained_model_name_or_path = "D:/AI/Kohya_ss/kohya_ss/models/flux1-dev.safetensors"
prior_loss_weight = 1
resolution = "512,512"
sample_every_n_epochs = 1
sample_prompts = "C:/train/models\\sample/prompt.txt"
sample_sampler = "euler"
save_every_n_epochs = 1
save_every_n_steps = 50
save_model_as = "safetensors"
save_precision = "fp16"
sdpa = true
seed = 42
split_mode = true
t5xxl = "D:/AI/Kohya_ss/kohya_ss/models/t5xxl_fp16.safetensors"
t5xxl_max_token_length = 512
timestep_sampling = "sigmoid"
train_batch_size = 1
train_data_dir = "D:/AI/train"
unet_lr = 0.0003
wandb_run_name = "Flux.test-v1.0"

Any help in resolving this issue would really be appreciated. Let me know if you need any more information.

bmaltais commented 2 months ago

Looks like a multi-GPU-specific issue. Does it work if you train on a single GPU? Multi-GPU training has never worked well on Windows; you need Linux for it.
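To make the single-GPU suggestion concrete, here is a minimal sketch of how the logged `accelerate launch` arguments could be rewritten for one GPU. The helper name `to_single_gpu` and the default GPU index `"0"` are hypothetical, chosen only for illustration. The flags themselves (`--multi_gpu`, `--num_processes`, `--gpu_ids`) are the ones that appear in the command in the error log. In the GUI, the same change is presumably made through the accelerate launch settings rather than by hand.

```python
from typing import List


def to_single_gpu(argv: List[str], gpu_id: str = "0") -> List[str]:
    """Return a copy of an `accelerate launch` argv adjusted for one GPU.

    Hypothetical helper: drops --multi_gpu and rewrites the values of
    --num_processes and --gpu_ids. Only the space-separated form
    (`--flag value`) is handled, which is the form shown in the log.
    """
    replacements = {"--num_processes": "1", "--gpu_ids": gpu_id}
    out: List[str] = []
    skip_next = False
    for arg in argv:
        if skip_next:
            # This is the old value of a flag we already rewrote.
            skip_next = False
            continue
        if arg == "--multi_gpu":
            continue
        if arg in replacements:
            out.extend([arg, replacements[arg]])
            skip_next = True
            continue
        out.append(arg)
    return out


if __name__ == "__main__":
    multi = [
        "accelerate", "launch", "--gpu_ids", "0,1", "--mixed_precision", "bf16",
        "--multi_gpu", "--num_processes", "2", "--num_machines", "1",
        "flux_train_network.py",
    ]
    print(" ".join(to_single_gpu(multi)))
```

The input list is left unmodified, so the original multi-GPU command stays available for a later run on Linux.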

TinyForge commented 2 months ago

Swapping to 1 GPU resolved the error. I had no idea multi-GPU support was so poor on Windows. I'll load this up in WSL in the future and give it a try. Marking this as resolved, thanks.
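For anyone who wants to retry multi-GPU on Windows before moving to WSL, one untested idea is to launch with the `USE_LIBUV` environment variable set to `0`. PyTorch's TCPStore documentation describes this variable as a way to fall back to the older, non-libuv backend. Whether the code path in this traceback (`static_tcp_rendezvous.py` in torch 2.4.0) honors it was not checked in this thread, so treat this as an assumption rather than a confirmed fix. The helper name `env_without_libuv` is hypothetical.

```python
import os
from typing import Dict, Mapping, Optional


def env_without_libuv(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy an environment mapping and set USE_LIBUV=0 in the copy.

    Assumption: the installed PyTorch build reads USE_LIBUV when it
    creates its TCPStore. That was not verified for this setup.
    """
    env = dict(os.environ if base is None else base)
    env["USE_LIBUV"] = "0"
    return env


# Usage sketch (not executed here): pass the modified environment to the
# launcher process instead of changing the current process's environment.
#
#   import subprocess
#   subprocess.run(["accelerate", "launch", ...], env=env_without_libuv())
```

Building a copy keeps the change scoped to the one launched process, so a run that still fails leaves the shell environment untouched.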