Open TutajITeraz opened 1 year ago
--gradient_checkpointing?
I had it turned on when I was trying to train with train_text_encoder, but without it the error is the same.
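As background, --gradient_checkpointing trades compute for memory: instead of keeping every intermediate activation for the backward pass, only a few checkpoints are kept and the rest are recomputed on demand. Here is a framework-free toy sketch of that idea (the layer functions are stand-ins, not the actual diffusers implementation):

```python
# Toy illustration of gradient checkpointing: store only every k-th
# activation in the forward pass and recompute the rest when needed.
# The "layers" here are cheap arithmetic stand-ins, not real model layers.

def make_layers(n):
    return [lambda x, i=i: x * 2 + i for i in range(n)]

def forward_checkpointed(layers, x, every=4):
    """Run forward, keeping only every `every`-th activation."""
    checkpoints = {0: x}
    for i, layer in enumerate(layers):
        x = layer(x)
        if (i + 1) % every == 0:
            checkpoints[i + 1] = x
    return x, checkpoints

def recompute_activation(layers, checkpoints, target, every=4):
    """Recompute the activation after `target` layers from the nearest
    earlier checkpoint (what the backward pass would do)."""
    start = (target // every) * every
    x = checkpoints[start]
    for i in range(start, target):
        x = layers[i](x)
    return x

layers = make_layers(12)
out, ckpts = forward_checkpointed(layers, 1.0)
print(sorted(ckpts))  # [0, 4, 8, 12] — 4 stored activations instead of 12
print(recompute_activation(layers, ckpts, 6))  # matches a full forward pass
```

The memory saved by not storing every activation is what makes --train_text_encoder feasible on smaller cards, at the cost of extra forward computation during backward.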
I don't think you can train the text encoder with 12 GB of VRAM. I'm happy to test it if you have a working command line.
I hope someone here could help me by providing any settings that work on 12 GB without an OOM error.
I tried the standard settings from the latest release, without any changes, and they don't work either.
I always train the text encoder on my 3060. You just need to go to your task manager and end any program with GPU usage (if possible). Close all browsers, of course. You need every last bit of VRAM, but it works fine, for me at least. Don't forget the gradient checkpointing parameter.
> I don't think you are able to train text encoder with 12Gb mem.
It's possible. I always do that, in your GUI and even in Auto's GUI. It's just a matter of clean VRAM.
So is 11.34 GB not enough? How can I free more VRAM?
Already said: close everything but the GUI. Also, look at the task manager to see which subprocess is using the GPU.
It runs fine. I have a similar setup. Do you mind running nvtop to make sure you have 11+ GB of VRAM free before starting the GUI?
The GUI only detects available memory once; if you open something after starting the GUI, it may use the wrong parameters.
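If nvtop isn't installed, nvidia-smi can answer the same questions. A quick sketch (standard nvidia-smi query flags; exact output format can vary by driver version):

```shell
# How much VRAM is actually free vs. used right now
nvidia-smi --query-gpu=memory.free,memory.used --format=csv

# Which processes are holding VRAM (candidates to close before training)
nvidia-smi --query-compute-apps=pid,process_name,used_memory --format=csv
```

Running these immediately before launching the GUI makes sure nothing has grabbed VRAM since you last checked.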
I will post the nvtop output this afternoon. As I recall, it was around 11.7 GB free with a lower screen resolution, gnome-classic, and only the GUI and a console open.
I think the real problem is here:
Tried to allocate 2.25 GiB (GPU 0; 11.75 GiB total capacity; 8.05 GiB already allocated; 113.25 MiB free; 9.83 GiB reserved in total by PyTorch)
If I add 2.25 GiB to the 9.83 GiB already reserved, that's 12.08 GiB, so is it impossible to satisfy on an RTX 3060? Or should I add the 2.25 GiB only to the 8.05 GiB already allocated, in which case 10.3 GiB shouldn't be a problem? It doesn't add up, or my calculations are wrong.
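One way to read those numbers, assuming the usual meaning of PyTorch's caching-allocator fields ("reserved" already includes "allocated", so the two shouldn't be summed): the failure is about finding a contiguous 2.25 GiB block, not about total usage.

```python
# Arithmetic for the OOM message above (values taken from the log).
total = 11.75          # GiB, GPU capacity
allocated = 8.05       # GiB, tensors currently in use
reserved = 9.83        # GiB, held by PyTorch's caching allocator
                       #      (this already includes `allocated`)
request = 2.25         # GiB, the failed allocation

# Free space inside PyTorch's reserved pool:
cached_free = reserved - allocated        # ~1.78 GiB
# Memory PyTorch could still reserve from the device:
device_headroom = total - reserved        # ~1.92 GiB

# Neither pool alone fits a contiguous 2.25 GiB block, and the cached
# free space is typically fragmented, so the allocation fails even
# though allocated + request (10.3 GiB) is below total (11.75 GiB).
print(f"cached_free={cached_free:.2f} GiB, headroom={device_headroom:.2f} GiB")
```

So your second calculation (8.05 + 2.25 = 10.3 GiB) is the right way to think about total demand; the request still fails because no single free region is large enough.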
I'm on 12 GB of VRAM (RTX 3060), and I can train with the text encoder. It's tight, as it uses ~96% of the available VRAM, but it works. I'm on Arch Linux, using i3 as the window manager. My settings are as follows:
```
--mixed_precision=fp16 --train_batch_size=1 --gradient_accumulation_steps=1 --use_8bit_adam --resolution=512 --gradient_checkpointing --train_text_encoder --seed=96576 --num_class_images=1000
```
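For reference, those flags slot into the diffusers DreamBooth training script roughly like this (the model name, paths, prompts, and step count below are placeholders I've added; adjust them to your setup):

```shell
# Sketch of a full invocation built around the flags above.
# MODEL_NAME, the data directories, and the prompts are placeholders.
export MODEL_NAME="runwayml/stable-diffusion-v1-5"

accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path=$MODEL_NAME \
  --instance_data_dir=./instance_images \
  --class_data_dir=./class_images \
  --output_dir=./output \
  --instance_prompt="a photo of sks person" \
  --class_prompt="a photo of a person" \
  --with_prior_preservation --prior_loss_weight=1.0 \
  --mixed_precision=fp16 \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --use_8bit_adam \
  --resolution=512 \
  --gradient_checkpointing \
  --train_text_encoder \
  --seed=96576 \
  --num_class_images=1000 \
  --max_train_steps=800
```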
Describe the bug
I'm getting an out-of-memory error, no matter which settings I try.
I have closed all apps and browsers, and turned off external screens, to free as much VRAM as I can.
To Reproduce
My settings are: