cc @sayakpaul here
Could you paste the full error trace? Will need to check where this stems from.
Because for a 32 GB V100 card, this seems weird.
I noticed that this bug is the same as the one mentioned in issue #3094 (https://github.com/huggingface/diffusers/issues/3094). It has been closed even though the problem is not solved; could you please reopen the issue?
We closed the issue because of https://github.com/huggingface/diffusers/issues/3094#issuecomment-1519293627.
Could you follow the suggestions shared in that thread and let us know if the issue still persists?
I have tried, but it's still not working.
Then please post a detailed thread including what you have already tried and the corresponding error logs. Without complete and comprehensive details, it's hard for us to make proper suggestions.
Thanks for your kind reply and patience. Here are the launch scripts and the logs:
Interesting.
Which library versions are you using?
@pcuenca if you have access to a 24 GB card, could you try to reproduce it once? I currently don't have access to one.
I just git cloned the latest master branch and installed diffusers locally; then I installed the requirements for text_to_image.
And what Torch version are you on? Have you tried using memory-efficient attention with xformers?
My tests:
Running the script with --enable_xformers_memory_efficient_attention works fine and takes only ~14 GB of GPU RAM. When using PyTorch 2 I verified that we are using AttnProcessor2_0 here: https://github.com/huggingface/diffusers/blob/c6ae8837512d0572639b9f57491d4482fdc8948c/src/diffusers/models/attention_processor.py#L161. I'm not sure why it no longer fits in 24 GB.
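For anyone who wants to check this on their own setup, here's a minimal sketch (the model id is just the one from this thread) to inspect which attention processor classes a loaded UNet is using:

```python
# Minimal sketch: list the attention processors of a loaded UNet.
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="unet"
)

# attn_processors maps each attention module's path to its processor
# instance; on PyTorch 2.x this should report AttnProcessor2_0.
for name, processor in unet.attn_processors.items():
    print(name, type(processor).__name__)
```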
I used PyTorch 1.11, without xformers.
Could you try with PyTorch 1.13.1 and xformers?
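For reference, a minimal sketch (assuming the xformers package is installed, and using the model from this thread) of enabling memory-efficient attention at inference time; passing --enable_xformers_memory_efficient_attention to the training script should do the equivalent on the training UNet:

```python
# Hedged sketch: enable xformers memory-efficient attention on a pipeline.
# Requires a CUDA GPU and the xformers package.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")
pipe.enable_xformers_memory_efficient_attention()

image = pipe("a pokemon with blue eyes").images[0]
```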
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
I also encountered this problem, and downgrading my PyTorch version didn't work. Then I found something: it seems that accelerate always runs in the base environment. So I abandoned my conda virtual environment and downgraded torch in (base). It works!
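In case it helps others hitting the same thing, a quick sanity-check sketch to confirm which environment Python, torch, accelerate, and diffusers are actually loaded from (run it with the same interpreter that launches training):

```python
# Sanity check: print where each package is imported from, to catch the
# case where accelerate/torch resolve to the (base) environment instead
# of the active conda env.
import sys

import accelerate
import diffusers
import torch

print("python    :", sys.executable)
print("torch     :", torch.__version__, torch.__file__)
print("accelerate:", accelerate.__version__, accelerate.__file__)
print("diffusers :", diffusers.__version__, diffusers.__file__)
```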
Describe the bug
When I ran the script examples/text_to_image/train_text_to_image.py with the following command:
```bash
export MODEL_NAME="CompVis/stable-diffusion-v1-4"
export dataset_name="lambdalabs/pokemon-blip-captions"

python train_text_to_image.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --dataset_name=$dataset_name \
  --use_ema \
  --resolution=512 --center_crop --random_flip \
  --train_batch_size=1 \
  --gradient_accumulation_steps=4 \
  --gradient_checkpointing \
  --max_train_steps=15000 \
  --learning_rate=1e-05 \
  --max_grad_norm=1 \
  --lr_scheduler="constant" --lr_warmup_steps=0 \
  --output_dir="sd-pokemon-model" \
  --mixed_precision="fp16"
```
I tried decreasing the resolution and removing --center_crop --random_flip, but it did not work.
The hardware I used: V100 (32 GB), PyTorch 1.11
logs:
Reproduction
train_1p.txt
Logs
No response
System Info
diffusers version: 0.17.0.dev0