ShivamShrirao / diffusers

🤗 Diffusers: State-of-the-art diffusion models for image and audio generation in PyTorch
https://huggingface.co/docs/diffusers
Apache License 2.0

OutOfMemoryError on an RTX 2080 TI #108

RobinFrcd opened this issue 1 year ago (Open)

RobinFrcd commented 1 year ago

Hi, I'm trying to make DreamBooth work on my 2080 Ti. I've compiled xformers with TORCH_CUDA_ARCH_LIST=7.5.

My GPU has about 10.5 GiB of available VRAM. However, when I run the training on my dataset, I get:

torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 10.74 GiB total capacity; 8.94 GiB already allocated; 9.31 MiB free; 9.03 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.
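
The last line of that message is the allocator's own hint; a low-effort thing to try is setting PYTORCH_CUDA_ALLOC_CONF before the launch (a sketch, the 64 MiB value is only an example to experiment with):

export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:64   # caps the block size the caching allocator will split, which can reduce fragmentation
accelerate launch train_dreambooth.py ...             # then launch with the same arguments as below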

I've tried to change --resolution=512 to --resolution=256 but it didn't help.

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=$INSTANCE_DIR \
  --class_data_dir=$CLASS_DIR \
  --output_dir=$OUTPUT_DIR \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --instance_prompt="a photo of sks person" \
  --class_prompt="a photo of person" \
  --resolution=256 \
  --train_batch_size=1 \
  --sample_batch_size=1 \
  --gradient_accumulation_steps=1 --gradient_checkpointing \
  --learning_rate=2e-6 \
  --lr_scheduler="constant" \
  --lr_warmup_steps=0 \
  --num_class_images=200 \
  --max_train_steps=800 \
  --mixed_precision=fp16 \
  --use_8bit_adam

Am I missing something here? I thought it could run with 9.92 GiB.
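
One thing worth ruling out is another process (the desktop session, for example) already holding part of the 10.74 GiB; the VRAM that is actually free right before launching can be checked with something like:

nvidia-smi --query-gpu=memory.total,memory.used,memory.free --format=csv
python -c "import torch; print(torch.cuda.mem_get_info())"   # prints (free_bytes, total_bytes); available in recent PyTorch releases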

Thanks!

ShivamShrirao commented 1 year ago

How did you install diffusers?

RobinFrcd commented 1 year ago

pip install git+https://github.com/ShivamShrirao/diffusers.git
pip install -U -r requirements.txt

ShivamShrirao commented 1 year ago

Can you verify xformers is working?

RobinFrcd commented 1 year ago

Sure, is there an easy way to verify this?

ShivamShrirao commented 1 year ago

Try with just import xformers
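
A bare import only proves the package loads; to see whether the compiled CUDA ops are actually usable, something along these lines can be run (python -m xformers.info may not exist in older builds, so treat that part as optional):

python -c "import xformers; print(xformers.__version__)"
python -c "import xformers.ops"   # watch for warnings about missing C++/CUDA extensions
python -m xformers.info           # if present, lists the available memory-efficient attention kernels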

RobinFrcd commented 1 year ago

Oh yes, sure, it works fine!

blackmagic24 commented 1 year ago

Same here on an RTX 3080 10 GB. It does not work natively under Linux (Ubuntu or Fedora).

diffusers install:

pip install git+https://github.com/ShivamShrirao/diffusers.git
pip install -U -r requirements.txt

xformers self-compiled with CUDA 11.6:

conda install -c "nvidia/label/cuda-11.6.2" cuda

or from here:

conda install -y -c pytorch -c conda-forge cudatoolkit=11.6 pytorch=1.12.1
conda install -y xformers -c xformers/label/dev

Nothing works:

RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 9.78 GiB total capacity; 8.06 GiB already allocated; 15.31 MiB free; 8.15 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.56.06    Driver Version: 520.56.06    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   38C    P8    29W / 320W |      0MiB / 10240MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

The really strange thing is that it runs under Windows 11 WSL (Ubuntu) without any problems, with the same launch.sh script and conda environment.

WSL installation guide from here: https://pastebin.com/uE1WcSxD

I would rather run it on Linux.

RobinFrcd commented 1 year ago

I've seen that PyTorch is not compatible with CUDA 11.8, maybe that's your issue here?

blackmagic24 commented 1 year ago

I've seen that PyTorch is not compatible with CUDA 11.8, maybe that's your issue here?

conda list cudatoolkit

# Name                    Version                   Build  Channel
cudatoolkit               11.6.0              hecad31d_10    conda-forge
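
conda's cudatoolkit is only part of the picture: the "CUDA Version: 11.8" shown by nvidia-smi is what the driver supports, not what PyTorch was built against. The build that actually gets used can be printed with:

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"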

blackmagic24 commented 1 year ago

deleted, not working

RobinFrcd commented 1 year ago

Very similar to what I've done. I'll give it another try; maybe there's an issue with my environment after all the attempts I've made!

ZeroCool22 commented 1 year ago

Try with just import xformers

Screenshot_2 (what I get when I try the import).

- `diffusers` version: 0.7.0.dev0
- Platform: Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
- Python version: 3.9.13
- PyTorch version (GPU?): 1.12.1+cu116 (True)
- Huggingface_hub version: 0.10.0
- Transformers version: 4.22.2
- Using GPU in script?: <fill in>
- Using distributed or parallel set-up in script?: <fill in>

RobinFrcd commented 1 year ago

import xformers should be done in the Python interpreter, or as a one-liner from the shell: python -c "import xformers"

ZeroCool22 commented 1 year ago

import xformers should be done in the Python interpreter, or as a one-liner from the shell: python -c "import xformers"

Screenshot_3

I did it in the Ubuntu console, because I use it with WSL2.