ShivamShrirao / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
https://huggingface.co/docs/diffusers
Apache License 2.0

Nothing happens when trying to run the dreambooth script from the command line #60

Open ProGamerGov opened 2 years ago

ProGamerGov commented 2 years ago

Describe the bug

I tried running the code earlier, and nothing seemed to happen after I ran the script via cmd:

I literally start the instance, upload my images, download the models and run the following code:

user@instance-1:~$ wget -q https://github.com/ShivamShrirao/diffusers/raw/main/examples/dreambooth/train_dreambooth.py
user@instance-1:~$ pip install -qq git+https://github.com/ShivamShrirao/diffusers
user@instance-1:~$ pip install -q -U --pre triton
user@instance-1:~$ pip install -q accelerate==0.12.0 transformers ftfy bitsandbytes gradio
user@instance-1:~$ accelerate launch train_dreambooth.py --save_sample_prompt "a photo of sks <concept>" --pretrained_model_name_or_path="v1-5-pruned-emaonly_ema_vae.ckpt" --instance_data_dir="./Dreambooth-Stable-Diffusion/training_images" --class_data_dir="./Dreambooth-Stable-Diffusion/regularization_images/<concept>" --output_dir="text-inversion-model" --with_prior_preservation --prior_loss_weight=1.0 --instance_prompt="photo of sks <concept>" --class_prompt="<concept>" --seed=1337 --resolution=512 --train_batch_size=1 --train_text_encoder --mixed_precision="no" --gradient_accumulation_steps=1 --learning_rate=1e-6 --lr_scheduler="constant" --lr_warmup_steps=0 --num_class_images=2000 --sample_batch_size=4 --max_train_steps=15000 --save_interval=500 --pretrained_vae_name_or_path="vae-ft-ema-560000-ema-pruned.ckpt"
The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_processes` was set to a value of `1`
        `--num_machines` was set to a value of `1`
        `--mixed_precision` was set to a value of `'no'`
        `--num_cpu_threads_per_process` was set to `6` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
[!] Not using xformers memory efficient attention.

^CTraceback (most recent call last):
  File "/opt/conda/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.7/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/opt/conda/lib/python3.7/site-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/opt/conda/lib/python3.7/site-packages/accelerate/commands/launch.py", line 352, in simple_launcher
    process.wait()
  File "/opt/conda/lib/python3.7/subprocess.py", line 1019, in wait
    return self._wait(timeout=timeout)
  File "/opt/conda/lib/python3.7/subprocess.py", line 1653, in _wait
    (pid, sts) = self._try_wait(0)
  File "/opt/conda/lib/python3.7/subprocess.py", line 1611, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt

I tried again without the accelerate stuff:

user@instance-1:~$ python train_dreambooth.py --save_sample_prompt "a photo of sks <concept>" --pretrained_model_name_or_path="v1-5-pruned-emaonly_ema_vae.ckpt" --instance_data_dir="./Dreambooth-Stable-Diffusion/training_images" --class_data_dir="./Dreambooth-Stable-Diffusion/regularization_images/<concept>" --output_dir="text-inversion-model" --with_prior_preservation --prior_loss_weight=1.0 --instance_prompt="photo of sks <concept>" --class_prompt="<concept>" --seed=1337 --resolution=512 --train_batch_size=1 --train_text_encoder --mixed_precision="no" --gradient_accumulation_steps=1 --learning_rate=1e-6 --lr_scheduler="constant" --lr_warmup_steps=0 --num_class_images=2000 --sample_batch_size=4 --max_train_steps=15000 --save_interval=500 --pretrained_vae_name_or_path="vae-ft-ema-560000-ema-pruned.ckpt"
[!] Not using xformers memory efficient attention.

It'd be helpful if there was some sort of indication if stuff was happening behind the scenes.

Reproduction

No response

Logs

No response

System Info

Debian Instance on GCP with an A100 40GB graphics card.

ProGamerGov commented 2 years ago

I got a bit further by doing:

huggingface-cli login
accelerate launch train_dreambooth.py --save_sample_prompt "a photo of sks <concept>" --pretrained_model_name_or_path "v1-5-pruned-emaonly.ckpt" --instance_data_dir "training_images" --class_data_dir "<concept>" --output_dir "text-inversion-model" --with_prior_preservation --prior_loss_weight 1.0 --instance_prompt "photo of sks <concept>" --class_prompt "<concept>" --seed 1337 --resolution 512 --train_batch_size 1 --train_text_encoder --mixed_precision "no" --gradient_accumulation_steps 1 --learning_rate 1e-6 --lr_scheduler "constant" --lr_warmup_steps 0 --num_class_images 2000 --sample_batch_size 4 --max_train_steps 15000 --save_interval 500 --pretrained_vae_name_or_path "vae-ft-ema-560000-ema-pruned.ckpt"

But it still ends up doing nothing with no indication of what's wrong.

ProGamerGov commented 2 years ago

The script just hangs, with no indication of any errors or progress:

user@instance-1:~$ accelerate launch train_dreambooth.py --save_sample_prompt "a photo of sks <concept>" --pretrained_model_name_or_path "runwayml/stable-diffusion-v1-5" --pretrained_vae_name_or_path="stabilityai/sd-vae-ft-ema" --instance_data_dir "training_images" --class_data_dir <concept> --output_dir "text-inversion-model" --with_prior_preservation --prior_loss_weight 1.0 --instance_prompt "photo of sks <concept>" --class_prompt "<concept>" --seed 1337 --resolution 512 --train_batch_size 1 --train_text_encoder --mixed_precision "no" --gradient_accumulation_steps 1 --learning_rate 1e-6 --lr_scheduler "constant" --lr_warmup_steps 0 --num_class_images 2000 --sample_batch_size 4 --max_train_steps 15000 --save_interval 500
The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_cpu_threads_per_process` was set to `6` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
[!] Not using xformers memory efficient attention.
/opt/conda/lib/python3.7/site-packages/accelerate/accelerator.py:179: UserWarning: `log_with=tensorboard` was passed but no supported trackers are currently installed.
  warnings.warn(f"`log_with={log_with}` was passed but no supported trackers are currently installed.")

This is what I have installed on the instance:

Collecting environment information...
PyTorch version: 1.11.0
Is debug build: False
CUDA used to build PyTorch: 11.3
ROCM used to build PyTorch: N/A

OS: Debian GNU/Linux 10 (buster) (x86_64)
GCC version: (Debian 8.3.0-6) 8.3.0
Clang version: Could not collect
CMake version: version 3.24.1
Libc version: glibc-2.10

Python version: 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)  [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-4.19.0-20-cloud-amd64-x86_64-with-debian-10.13
Is CUDA available: True
CUDA runtime version: 11.3.109
CUDA_MODULE_LOADING set to: 
GPU models and configuration: GPU 0: NVIDIA A100-SXM4-40GB
Nvidia driver version: 470.57.02
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] mypy-extensions==0.4.3
[pip3] numpy==1.19.5
[pip3] torch==1.11.0
[pip3] torch-xla==1.11
[pip3] torchvision==0.12.0+cu113
[conda] blas                      2.115                       mkl    conda-forge
[conda] blas-devel                3.9.0            15_linux64_mkl    conda-forge
[conda] cudatoolkit               11.3.1               ha36c431_9    nvidia
[conda] dlenv-pytorch-1-11-gpu    1.0.20220630     py37hc1c1d6d_0    file:///tmp/conda-pkgs
[conda] libblas                   3.9.0            15_linux64_mkl    conda-forge
[conda] libcblas                  3.9.0            15_linux64_mkl    conda-forge
[conda] liblapack                 3.9.0            15_linux64_mkl    conda-forge
[conda] liblapacke                3.9.0            15_linux64_mkl    conda-forge
[conda] mkl                       2022.1.0           h84fe81f_915    conda-forge
[conda] mkl-devel                 2022.1.0           ha770c72_916    conda-forge
[conda] mkl-include               2022.1.0           h84fe81f_915    conda-forge
[conda] numpy                     1.19.5           py37h3e96413_3    conda-forge
[conda] pytorch                   1.11.0          py3.7_cuda11.3_cudnn8.2.0_0    pytorch
[conda] pytorch-mutex             1.0                        cuda    pytorch
[conda] torchvision               0.12.0+cu113             pypi_0    pypi

pip list output:

Package                               Version
------------------------------------- -----------------
accelerate                            0.12.0
aiohttp                               3.8.1
aiosignal                             1.2.0
ansiwrap                              0.8.4
anyio                                 3.6.1
appdirs                               1.4.4
argon2-cffi                           21.3.0
argon2-cffi-bindings                  21.2.0
arrow                                 1.2.2
asn1crypto                            1.5.1
async-timeout                         4.0.2
asynctest                             0.13.0
attrs                                 21.4.0
Babel                                 2.10.3
backcall                              0.2.0
backports.functools-lru-cache         1.6.4
bcrypt                                4.0.1
beatrix-jupyterlab                    3.1.7
beautifulsoup4                        4.11.1
binaryornot                           0.4.4
bitsandbytes                          0.35.0
black                                 22.6.0
bleach                                5.0.1
blinker                               1.4
brotlipy                              0.7.0
cachetools                            5.0.0
certifi                               2022.6.15
cffi                                  1.15.0
chardet                               5.0.0
charset-normalizer                    2.1.0
click                                 8.1.3
cloudpickle                           2.1.0
cmake                                 3.24.1.1
colorama                              0.4.5
conda                                 4.13.0
conda-package-handling                1.8.1
cookiecutter                          2.1.1
cryptography                          37.0.2
cycler                                0.11.0
dataclasses                           0.8
debugpy                               1.6.0
decorator                             5.1.1
defusedxml                            0.7.1
diffusers                             0.7.0.dev0
docker                                5.0.3
docker-pycreds                        0.4.0
entrypoints                           0.4
fastapi                               0.85.1
fastjsonschema                        2.15.3
ffmpy                                 0.3.0
filelock                              3.8.0
flit_core                             3.7.1
fonttools                             4.33.3
frozenlist                            1.3.0
fsspec                                2022.5.0
ftfy                                  6.1.1
gcsfs                                 2022.5.0
gitdb                                 4.0.9
GitPython                             3.1.27
google-api-core                       2.8.1
google-api-python-client              2.52.0
google-auth                           2.9.0
google-auth-httplib2                  0.1.0
google-auth-oauthlib                  0.5.2
google-cloud-aiplatform               1.15.0
google-cloud-appengine-logging        1.1.2
google-cloud-audit-log                0.2.2
google-cloud-bigquery                 2.34.4
google-cloud-bigquery-storage         2.13.2
google-cloud-bigtable                 2.10.1
google-cloud-core                     2.3.1
google-cloud-dataproc                 4.0.3
google-cloud-datastore                2.7.1
google-cloud-firestore                2.5.3
google-cloud-kms                      2.11.2
google-cloud-language                 2.4.3
google-cloud-logging                  3.1.2
google-cloud-monitoring               2.9.2
google-cloud-pubsub                   1.7.0
google-cloud-resource-manager         1.5.1
google-cloud-scheduler                2.6.4
google-cloud-spanner                  3.15.1
google-cloud-speech                   2.14.1
google-cloud-storage                  2.4.0
google-cloud-tasks                    2.9.1
google-cloud-translate                3.7.4
google-cloud-videointelligence        2.7.1
google-cloud-vision                   2.7.3
google-crc32c                         1.1.2
google-resumable-media                2.3.3
googleapis-common-protos              1.56.3
gradio                                3.6
greenlet                              1.1.2
grpc-google-iam-v1                    0.12.4
grpcio                                1.47.0
grpcio-gcp                            0.2.2
grpcio-status                         1.47.0
h11                                   0.12.0
htmlmin                               0.1.12
httpcore                              0.15.0
httplib2                              0.20.4
httpx                                 0.23.0
huggingface-hub                       0.10.1
idna                                  3.3
ImageHash                             4.2.1
importlib-metadata                    4.11.4
importlib-resources                   5.8.0
ipykernel                             6.15.0
ipython                               7.33.0
ipython-genutils                      0.2.0
ipython-sql                           0.3.9
jedi                                  0.18.1
jeepney                               0.8.0
Jinja2                                3.1.2
jinja2-time                           0.2.0
joblib                                1.1.0
json5                                 0.9.5
jsonschema                            4.6.1
jupyter-client                        7.3.4
jupyter-core                          4.10.0
jupyter-http-over-ws                  0.0.8
jupyter-server                        1.18.0
jupyter-server-mathjax                0.2.5
jupyter-server-proxy                  3.2.1
jupyterlab                            3.2.9
jupyterlab-git                        0.37.1
jupyterlab-pygments                   0.2.2
jupyterlab-server                     2.14.0
jupytext                              1.13.8
keyring                               23.6.0
keyrings.google-artifactregistry-auth 1.0.0
kiwisolver                            1.4.3
kubernetes                            24.2.0
linkify-it-py                         1.0.3
llvmlite                              0.38.1
Markdown                              3.3.7
markdown-it-py                        2.1.0
MarkupSafe                            2.1.1
matplotlib                            3.5.2
matplotlib-inline                     0.1.3
mdit-py-plugins                       0.3.0
mdurl                                 0.1.0
missingno                             0.4.2
mistune                               0.8.4
multidict                             6.0.2
multimethod                           1.4
munkres                               1.1.4
mypy-extensions                       0.4.3
nb-conda                              2.2.1
nb-conda-kernels                      2.3.1
nbclassic                             0.3.7
nbclient                              0.6.5
nbconvert                             6.5.0
nbdime                                3.1.1
nbformat                              5.4.0
nest-asyncio                          1.5.5
networkx                              2.7.1
notebook                              6.4.12
notebook-executor                     0.2
notebook-shim                         0.1.0
numba                                 0.55.2
numpy                                 1.19.5
oauthlib                              3.2.0
orjson                                3.8.0
packaging                             21.3
pandas                                1.3.5
pandas-profiling                      3.2.0
pandocfilters                         1.5.0
papermill                             2.3.4
paramiko                              2.11.0
parso                                 0.8.3
pathspec                              0.9.0
patsy                                 0.5.2
pexpect                               4.8.0
phik                                  0.12.2
pickleshare                           0.7.5
Pillow                                9.1.1
pip                                   22.1.2
platformdirs                          2.5.1
pluggy                                1.0.0
prettytable                           3.3.0
prometheus-client                     0.14.1
prompt-toolkit                        3.0.30
proto-plus                            1.20.6
protobuf                              3.20.1
psutil                                5.9.1
ptyprocess                            0.7.0
pyarrow                               8.0.0
pyasn1                                0.4.8
pyasn1-modules                        0.2.7
pycosat                               0.6.3
pycparser                             2.21
pycryptodome                          3.15.0
pydantic                              1.9.1
pydub                                 0.25.1
Pygments                              2.12.0
PyJWT                                 2.4.0
PyNaCl                                1.5.0
pyOpenSSL                             22.0.0
pyparsing                             3.0.9
pyrsistent                            0.18.1
PySocks                               1.7.1
python-dateutil                       2.8.2
python-multipart                      0.0.5
python-slugify                        6.1.2
pytz                                  2022.1
pyu2f                                 0.1.5
PyWavelets                            1.3.0
PyYAML                                6.0
pyzmq                                 23.2.0
regex                                 2022.9.13
requests                              2.28.1
requests-oauthlib                     1.3.1
retrying                              1.3.3
rfc3986                               1.5.0
rsa                                   4.8
ruamel-yaml-conda                     0.15.100
scikit-learn                          1.0.2
scipy                                 1.7.3
seaborn                               0.11.2
SecretStorage                         3.3.2
Send2Trash                            1.8.0
setuptools                            59.8.0
simpervisor                           0.4
six                                   1.16.0
smmap                                 3.0.5
sniffio                               1.2.0
soupsieve                             2.3.1
SQLAlchemy                            1.4.39
sqlparse                              0.4.2
starlette                             0.20.4
statsmodels                           0.13.2
tangled-up-in-unicode                 0.2.0
tenacity                              8.0.1
terminado                             0.15.0
text-unidecode                        1.3
textwrap3                             0.9.2
threadpoolctl                         3.1.0
tinycss2                              1.1.1
tokenizers                            0.13.1
toml                                  0.10.2
tomli                                 2.0.1
torch                                 1.11.0
torch-xla                             1.11
torchvision                           0.12.0+cu113
tornado                               6.1
tqdm                                  4.64.0
traitlets                             5.3.0
transformers                          4.23.1
triton                                2.0.0.dev20221014
typed-ast                             1.5.4
typing_extensions                     4.2.0
uc-micro-py                           1.0.1
ujson                                 5.3.0
unicodedata2                          14.0.0
Unidecode                             1.3.4
uritemplate                           4.1.1
urllib3                               1.26.9
uvicorn                               0.19.0
visions                               0.7.4
wcwidth                               0.2.5
webencodings                          0.5.1
websocket-client                      1.3.3
websockets                            10.3
wheel                                 0.37.1
wrapt                                 1.14.1
yarl                                  1.7.2
zipp                                  3.8.0

ProGamerGov commented 2 years ago

@ShivamShrirao Any ideas on why it doesn't work?

ShivamShrirao commented 2 years ago

Can't say. Btw you should install xformers. To check where script is hanging, press ctrl+C. The traceback will show where it was stuck.
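
When the script is started through `accelerate launch`, Ctrl+C may only show the launcher waiting on the child process, as in the first traceback in this thread. One possible alternative is Python's standard `faulthandler` module, which can dump every thread's stack after a timeout. The sketch below is an illustration only and is not part of `train_dreambooth.py`; the helper names `enable_hang_watchdog` and `disable_hang_watchdog` are hypothetical.

```python
import faulthandler
import sys


def enable_hang_watchdog(timeout_s=60.0, file=None):
    """Dump all thread stacks to `file` every `timeout_s` seconds until cancelled.

    Hypothetical helper, not part of train_dreambooth.py. `file` must be a real
    file object backed by a file descriptor; it defaults to sys.stderr.
    """
    target = file if file is not None else sys.stderr
    faulthandler.dump_traceback_later(timeout_s, repeat=True, file=target)


def disable_hang_watchdog():
    """Stop the periodic stack dumps, e.g. once training has finished."""
    faulthandler.cancel_dump_traceback_later()
```

Calling `enable_hang_watchdog()` near the top of the training script's `main()` should make a silent hang print a stack trace once a minute, which points at the line the process is stuck on without needing to interrupt it.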

ProGamerGov commented 2 years ago

@ShivamShrirao I did some more testing and it looks like it hangs on the following line for some reason:

    if args.seed is not None:
        set_seed(args.seed)

When I omitted the seed parameter, everything worked.
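
To narrow down which part of seeding stalls, one option is to seed each random number generator separately and print before each step. This is a rough stand-in written for illustration, not the `set_seed` that the script imports, which may seed additional backends; the function name `seed_step_by_step` is hypothetical.

```python
import random
import time


def seed_step_by_step(seed):
    """Seed each RNG on its own and return (name, seconds) for every step done.

    Hypothetical diagnostic helper. The last line printed before a hang
    identifies the step that never returned.
    """
    steps = []

    def timed(name, fn):
        print(f"seeding {name} ...", flush=True)
        start = time.perf_counter()
        fn()
        steps.append((name, time.perf_counter() - start))

    timed("random", lambda: random.seed(seed))

    try:
        import numpy as np
        timed("numpy", lambda: np.random.seed(seed))
    except ImportError:
        pass  # NumPy not installed, skip this step

    try:
        import torch
        timed("torch", lambda: torch.manual_seed(seed))
        # Safe to call without a GPU; it is ignored if CUDA is unavailable.
        timed("torch.cuda", lambda: torch.cuda.manual_seed_all(seed))
    except ImportError:
        pass  # PyTorch not installed, skip these steps

    return steps
```

Running `seed_step_by_step(1337)` in the same environment as the training script would show whether one of these calls is the slow one, or whether the stall happens somewhere this sketch does not cover.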

ProGamerGov commented 2 years ago

Without using the seed parameter, it makes it up to this line before it stops working again:

accelerator.backward(loss)

I tried looking for similar issues:

https://github.com/huggingface/accelerate/issues/287

https://github.com/huggingface/accelerate/issues/191

But I'm not sure why it's hanging on this line for this repo.

These are the parameters that I'm using:

export MODEL_NAME="runwayml/stable-diffusion-v1-5"
export VAE_NAME="stabilityai/sd-vae-ft-mse"
export INSTANCE_DIR="concept_images"
export CLASS_DIR="class_reg_images"
export OUTPUT_DIR="path-to-save-model"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --pretrained_vae_name_or_path=$VAE_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks <concept>" \
  --class_prompt="<concept>" \
  --save_sample_prompt="photo of sks <concept>" \
  --resolution=512 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --learning_rate=5e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=1290 \
  --save_interval=500 \
  --max_train_steps=25000 \
  --train_text_encoder \
  --mixed_precision="no" \
  --not_cache_latents

Edit:

This issue may be related? https://github.com/pytorch/pytorch/issues/85841

ProGamerGov commented 2 years ago

> Can't say. Btw you should install xformers. To check where script is hanging, press ctrl+C. The traceback will show where it was stuck.

I need to find a pre-compiled xformers binary for the A100 40GB card first.

Edit:

I just tried using this version of xformers and got the same issue:

pip install -q https://github.com/TheLastBen/fast-stable-diffusion/raw/main/precompiled/A100/xformers-0.0.13.dev0-py3-none-any.whl

user@instance-1:~$ sh launch.sh
The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_cpu_threads_per_process` was set to `6` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
WARNING:root:A matching Triton is not available, some optimizations will not be enabled.
Error caught was: No module named 'triton'
Downloading: 100%|████████████████████████████████████████| 1.06M/1.06M [00:00<00:00, 2.89MB/s]
Downloading: 100%|████████████████████████████████████████| 525k/525k [00:00<00:00, 2.18MB/s]
Downloading: 100%|████████████████████████████████████████| 472/472 [00:00<00:00, 471kB/s]
Downloading: 100%|████████████████████████████████████████| 806/806 [00:00<00:00, 848kB/s]
Downloading: 100%|████████████████████████████████████████| 617/617 [00:00<00:00, 607kB/s]
Downloading: 100%|████████████████████████████████████████| 492M/492M [00:06<00:00, 72.7MB/s]
Downloading: 100%|████████████████████████████████████████| 335M/335M [00:04<00:00, 73.6MB/s]
Downloading: 100%|████████████████████████████████████████| 547/547 [00:00<00:00, 522kB/s]
Downloading: 100%|████████████████████████████████████████| 3.44G/3.44G [00:48<00:00, 70.2MB/s]
Downloading: 100%|████████████████████████████████████████| 743/743 [00:00<00:00, 721kB/s]
Steps:   0%|                                                                      | 0/25000 [00:00<?, ?it/s]

ProGamerGov commented 2 years ago

I was able to get it working!

I created a file called environment.yaml and put this inside:

name: ldm
channels:
  - pytorch
  - defaults
dependencies:
  - python=3.8.10
  - pip=20.3
  - cudatoolkit=11.3
  - pip:
    - git+https://github.com/ShivamShrirao/diffusers.git
    - accelerate==0.12.0
    - torchvision
    - transformers>=4.21.0
    - ftfy
    - tensorboard
    - modelcards

Next I ran:

conda env create -f environment.yaml

Followed by:

conda activate ldm

After running the dreambooth script, it finally gave me an error:

NVIDIA A100-SXM4-40GB with CUDA capability sm_80 is not compatible with the current PyTorch installation.
The current PyTorch install supports CUDA capabilities sm_37 sm_50 sm_60 sm_70.
If you want to use the NVIDIA A100-SXM4-40GB GPU with PyTorch, please check the instructions at https://pytorch.org/get-started/locally/

So I ran the following command, and now the dreambooth script seems to work!

conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
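
The `sm_80` error above can also be checked for up front. The sketch below compares the GPU's compute capability with the architectures the installed PyTorch was built for, using `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list`. Matching on the major version only is a simplifying assumption made here, and the helper names are hypothetical.

```python
def parse_sm(arch_list):
    """Turn entries such as "sm_70" into integers such as 70, ignoring the rest."""
    return sorted(int(a.split("_")[1]) for a in arch_list if a.startswith("sm_"))


def is_capability_supported(capability, arch_list):
    """Return True if any compiled architecture shares the GPU's major version.

    Simplified rule (an assumption of this sketch): binaries are treated as
    compatible within one major compute-capability version.
    """
    major, _minor = capability
    return any(sm // 10 == major for sm in parse_sm(arch_list))


def check_current_gpu(index=0):
    """Run the check against the real GPU; requires PyTorch with CUDA."""
    import torch  # imported lazily so the helpers above work without PyTorch

    capability = torch.cuda.get_device_capability(index)
    arch_list = torch.cuda.get_arch_list()
    return is_capability_supported(capability, arch_list)
```

With the values from the error message, `is_capability_supported((8, 0), ["sm_37", "sm_50", "sm_60", "sm_70"])` is false, which matches PyTorch refusing the A100 until a build with `sm_80` support was installed.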

ProGamerGov commented 2 years ago

I'm having trouble repeating my above success, even when using the exact same commands:

wget -q https://github.com/ShivamShrirao/diffusers/raw/main/examples/dreambooth/train_dreambooth.py
conda env create -f conda.yaml
conda activate ldm
huggingface-cli login
pip install -q https://github.com/TheLastBen/fast-stable-diffusion/raw/main/precompiled/A100/xformers-0.0.13.dev0-py3-none-any.whl

accelerate config

pip install triton
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
sh launch.sh
sh launch.sh
The following values were not passed to `accelerate launch` and had defaults used instead:
        `--num_cpu_threads_per_process` was set to `6` to improve out-of-box performance
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
libc10_cuda.so: cannot open shared object file: No such file or directory
WARNING:root:WARNING: libc10_cuda.so: cannot open shared object file: No such file or directory
Need to compile C++ extensions to get sparse attention suport. Please run python setup.py build develop
Downloading: 100%|████████████████████████████████████████| 1.06M/1.06M [00:00<00:00, 2.89MB/s]
Downloading: 100%|████████████████████████████████████████| 525k/525k [00:00<00:00, 2.17MB/s]
Downloading: 100%|████████████████████████████████████████| 472/472 [00:00<00:00, 494kB/s]
Downloading: 100%|████████████████████████████████████████| 806/806 [00:00<00:00, 856kB/s]
Downloading: 100%|████████████████████████████████████████| 617/617 [00:00<00:00, 619kB/s]
Downloading: 100%|████████████████████████████████████████| 492M/492M [00:04<00:00, 100MB/s]
Downloading: 100%|████████████████████████████████████████| 335M/335M [00:03<00:00, 100MB/s]
Downloading: 100%|████████████████████████████████████████| 547/547 [00:00<00:00, 530kB/s]
Downloading: 100%|████████████████████████████████████████| 3.44G/3.44G [00:35<00:00, 98.2MB/s]
Downloading: 100%|████████████████████████████████████████| 743/743 [00:00<00:00, 678kB/s]
Steps:   0%|                                                                      | 0/25000 [00:00<?, ?it/s]Traceback (most recent call last):
  File "train_dreambooth.py", line 765, in <module>
    main()
  File "train_dreambooth.py", line 712, in main
    noise_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/diffusers/models/unet_2d_condition.py", line 296, in forward
    sample, res_samples = downsample_block(
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/diffusers/models/unet_blocks.py", line 563, in forward
    hidden_states = attn(hidden_states, context=encoder_hidden_states)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/diffusers/models/attention.py", line 169, in forward
    hidden_states = block(hidden_states, context=context)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/diffusers/models/attention.py", line 218, in forward
    hidden_states = self.attn1(self.norm1(hidden_states)) + hidden_states
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/diffusers/models/attention.py", line 291, in forward
    hidden_states = xformers.ops.memory_efficient_attention(query, key, value)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/xformers/ops.py", line 617, in memory_efficient_attention
    op = AttentionOpDispatch.from_arguments(
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/xformers/ops.py", line 580, in op
    raise NotImplementedError(f"No operator found for this attention: {self}")
NotImplementedError: No operator found for this attention: AttentionOpDispatch(dtype=torch.float32, device=device(type='cpu'), k=40, has_dropout=False, attn_bias_type=<class 'NoneType'>, kv_len=4096, q_len=4096)
Steps:   0%|                                                                      | 0/25000 [00:04<?, ?it/s]
Traceback (most recent call last):
  File "/opt/conda/envs/ldm/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/accelerate/commands/accelerate_cli.py", line 43, in main
    args.func(args)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/accelerate/commands/launch.py", line 837, in launch_command
    simple_launcher(args)
  File "/opt/conda/envs/ldm/lib/python3.8/site-packages/accelerate/commands/launch.py", line 354, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['/opt/conda/envs/ldm/bin/python', 'train_dreambooth.py',

Same error reported here: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/1975

Seems like it's a PyTorch version issue: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/576#issuecomment-1250136231


The new error I have was reported here as well: https://github.com/ShivamShrirao/diffusers/issues/26
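
The `device(type='cpu')` in the xformers error suggests the tensors never reached the GPU, which would fit the `libc10_cuda.so` warning printed earlier in the same run. A quick way to see what kind of PyTorch build is active is sketched below; the function name `describe_torch_build` is hypothetical, and the check only reports facts about the install rather than fixing anything.

```python
def describe_torch_build():
    """Report whether the active PyTorch install was built with CUDA support.

    Hypothetical diagnostic helper. Returns {"installed": False} when PyTorch
    cannot be imported at all.
    """
    try:
        import torch
    except ImportError:
        return {"installed": False}

    return {
        "installed": True,
        "torch_version": torch.__version__,
        # torch.version.cuda is None on CPU-only builds.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_build().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` prints `None` or `cuda_available` prints `False` inside the activated conda environment, the training script would run on the CPU, where this xformers attention operator is not available.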

ProGamerGov commented 2 years ago

Looking at the log for when I succeeded, I see the following PyTorch / Cuda versions:

torchvision        pytorch/linux-64::torchvision-0.13.1-py38_cu113 None
pytorch            pytorch/linux-64::pytorch-1.12.1-py3.8_cuda11.3_cudnn8.3.2_0 None

pytorch-1.12.1
torchvision-0.13.1  

So, maybe the versions are somehow getting messed up?
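
One way to test that guess is to compare what is actually installed in the active environment against the versions from the successful log. The sketch below uses the standard `importlib.metadata` module (Python 3.8+); the helper names are hypothetical, and it assumes the packages are registered under the distribution names `torch` and `torchvision`.

```python
from importlib import metadata

# Versions taken from the successful install log above.
EXPECTED = {"torch": "1.12.1", "torchvision": "0.13.1"}


def installed_version(name):
    """Return the installed version string for `name`, or None if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected):
    """Return {name: (wanted, found)} for every package that differs or is missing."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # Compare only the public part, so "0.13.1+cu113" matches "0.13.1".
        if found is None or found.split("+")[0] != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: wanted {wanted}, found {found}")
```

An empty result means both packages match the versions from the working run; anything printed points at a package that a later `pip install` or `conda install` may have replaced.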

ProGamerGov commented 2 years ago

I think that it may have been the PyTorch version. I tried using this environment.yaml file:

name: ldm
channels:
  - pytorch
  - defaults
dependencies:
  - python=3.8.10
  - pip=20.3
  - cudatoolkit=11.3
  - pytorch=1.12.1
  - torchvision=0.13.1
  - pip:
    - git+https://github.com/ShivamShrirao/diffusers.git
    - triton
    - accelerate==0.12.0
    - torchvision
    - transformers>=4.21.0
    - ftfy
    - tensorboard
    - modelcards

And I used it as part of these commands:

wget -q https://github.com/ShivamShrirao/diffusers/raw/main/examples/dreambooth/train_dreambooth.py
conda env create -f environment.yaml
conda activate ldm
pip install -q https://github.com/TheLastBen/fast-stable-diffusion/raw/main/precompiled/A100/xformers-0.0.13.dev0-py3-none-any.whl
huggingface-cli login

And it worked!

zetyquickly commented 1 year ago

@ProGamerGov what base docker image did you use in that case?