AUTOMATIC1111 / stable-diffusion-webui

Stable Diffusion web UI
GNU Affero General Public License v3.0

[Bug]: Broken Dreambooth LoRA training #11312

Open levicki opened 1 year ago

levicki commented 1 year ago

Is there an existing issue for this?

What happened?

LoRA training in bf16 precision is impossible on NVIDIA cards due to a runtime error.

Please see the related issue in pytorch, which was resolved in a newer build (2.1.0a0+git22ca1a1).

Steps to reproduce the problem

N/A

What should have happened?

N/A

Commit where the problem happens

baf6946e06249c5af9851c60171692c44ef633e0

What Python version are you running on ?

Python 3.10.x

What platforms do you use to access the UI ?

Windows

What device are you running WebUI on?

Nvidia GPUs (RTX 20 above)

What browsers do you use to access the UI ?

Brave

Command Line Arguments

--api --xformers

List of extensions

(screenshot of installed extensions attached)

Console logs

RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'
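For context, the error above comes from calling `torch.triu`/`torch.tril` on a bfloat16 CUDA tensor in torch builds before the fix. A minimal illustrative workaround (not the webui's actual fix) is to compute in float32 and cast back; the helper name here is hypothetical:

```python
import torch

def triu_bf16_safe(x: torch.Tensor, diagonal: int = 0) -> torch.Tensor:
    """Work around `"triu_tril_cuda_template" not implemented for 'BFloat16'`
    on affected torch builds by computing triu in float32 and casting back.
    Illustrative sketch only; the real fix is upgrading torch."""
    if x.is_cuda and x.dtype == torch.bfloat16:
        return torch.triu(x.float(), diagonal=diagonal).to(torch.bfloat16)
    return torch.triu(x, diagonal=diagonal)
```

On fixed builds (torch >= 2.1) the fallback branch is unnecessary and plain `torch.triu` works directly on bfloat16 CUDA tensors.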

Additional information

N/A

missionfloyd commented 1 year ago

resolved in newer build (2.1.0a0+git22ca1a1)

Torch 2.1 nightly can be installed by adding

set TORCH_COMMAND=pip install --pre torch torchvision --extra-index-url https://download.pytorch.org/whl/nightly/cu118

to webui-user.bat and adding --reinstall-torch to COMMANDLINE_ARGS (remove it afterward).
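Put together, webui-user.bat would look roughly like this (a sketch assuming the stock file; the --api --xformers flags are the reporter's existing arguments, and --reinstall-torch should be removed once torch has been reinstalled):

```bat
rem webui-user.bat — sketch combining the steps above
set TORCH_COMMAND=pip install --pre torch torchvision --extra-index-url https://download.pytorch.org/whl/nightly/cu118
set COMMANDLINE_ARGS=--api --xformers --reinstall-torch

call webui.bat
```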

You could also try kohya_ss for training.

levicki commented 1 year ago

@missionfloyd

Thanks for responding.

Torch 2.1 nightly can be installed by adding...

I know, but then I enter the maze of dependencies (for example, xformers 0.0.17 might not work with it, and there are probably other packages that will complain).

You could also try kohya_ss for training.

I have already considered it, but it is less convenient to maintain another environment for training and have to switch between the two.

I would prefer that the requirements for AUTOMATIC1111 were updated to use torch 2.1 (unless that breaks something else, of course).

Edit: Oh, and it would also be nice if you supported CUDA 12.1 pytorch (https://download.pytorch.org/whl/nightly/cu121) — perhaps as an option during install?

w-e-w commented 1 year ago

you should be able to just set the environment variable TORCH_INDEX_URL to the torch wheel index URL https://github.com/AUTOMATIC1111/stable-diffusion-webui/blob/59419bd64a1581caccaac04dceb66c1c069a2db1/modules/launch_utils.py#L228

set TORCH_INDEX_URL=https://download.pytorch.org/whl/nightly/cu118
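A minimal sketch of how that environment variable is consumed, with names based on the linked launch_utils.py (simplified for illustration, not the exact webui code):

```python
import os

def build_torch_command() -> str:
    """Sketch of the torch install command construction: TORCH_INDEX_URL
    overrides the wheel index, and TORCH_COMMAND overrides the whole command.
    Default URL here is an assumption mirroring the cu118 default."""
    torch_index_url = os.environ.get(
        "TORCH_INDEX_URL", "https://download.pytorch.org/whl/cu118"
    )
    return os.environ.get(
        "TORCH_COMMAND",
        f"pip install torch torchvision --extra-index-url {torch_index_url}",
    )
```

Setting TORCH_INDEX_URL to the nightly index therefore changes only where pip looks for wheels, while TORCH_COMMAND replaces the full install command.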