p1atdev / LECO

Low-rank adaptation for Erasing COncepts from diffusion models.
https://arxiv.org/abs/2303.07345
Apache License 2.0

8GB VRAM GPU without bf16? #29

Closed sashasubbbb closed 1 year ago

sashasubbbb commented 1 year ago

Is there any way to run LECO on an 8GB VRAM GPU that does not support bf16 (only fp32, fp16)?

Trying to run fp16 immediately results in loss=nan, as noted in #8. Running in fp32 results in OOM even at rank 1 with the "full" training method. Watching VRAM usage on Colab in fp32, it spikes to about 12GB but then sits around 6GB most of the time. Switching from AdamW to Adam8bit from bitsandbytes doesn't seem to reduce VRAM at all.

Edit: using batch_size: 1 with fp32 seems to fit just fine in 8GB.