On my RTX 3090 + 32 GB RAM machine I'm able to train a FLUX LoRA just fine, and full fine-tuning also trains (using Adafactor); however, the script crashes when trying to save the first full checkpoint due to insufficient (CPU) RAM. Is there any way to reduce the peak memory usage when saving the transformer checkpoint to disk?
Please add the --mem_eff_save option. This uses a custom implementation of the model saving function instead of the safetensors library to reduce memory consumption when saving. Please reopen if the issue remains.
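For anyone curious what "custom implementation" means here: the idea is to stream the checkpoint to disk tensor by tensor instead of handing the whole state dict to `safetensors.torch.save_file`, which can hold an extra full copy of the weights in CPU RAM. Below is a minimal, hedged sketch of that approach (the function name `mem_eff_save_file` and all details are illustrative, not the repository's actual code):

```python
import json
import struct

import torch


def mem_eff_save_file(state_dict: dict, path: str) -> None:
    """Illustrative streaming writer in safetensors-like layout.

    Writes the JSON header first, then appends each tensor's raw bytes
    one at a time, so only a single tensor is buffered in RAM at once.
    This is a sketch of the general technique, not kohya's implementation.
    """
    dtype_names = {
        torch.float32: "F32",
        torch.float16: "F16",
        torch.bfloat16: "BF16",
    }

    # Build the header (dtype, shape, byte offsets) without copying weights.
    header = {}
    offset = 0
    for name, tensor in state_dict.items():
        n_bytes = tensor.numel() * tensor.element_size()
        header[name] = {
            "dtype": dtype_names[tensor.dtype],
            "shape": list(tensor.shape),
            "data_offsets": [offset, offset + n_bytes],
        }
        offset += n_bytes
    header_bytes = json.dumps(header).encode("utf-8")

    with open(path, "wb") as f:
        # 8-byte little-endian header length, then the JSON header.
        f.write(struct.pack("<Q", len(header_bytes)))
        f.write(header_bytes)
        # Stream tensors in the same order they were recorded in the header.
        for tensor in state_dict.values():
            t = tensor.detach().cpu().contiguous().reshape(-1)
            if t.dtype == torch.bfloat16:
                # bfloat16 has no NumPy equivalent; reinterpret as raw bytes.
                f.write(t.view(torch.uint8).numpy().tobytes())
            else:
                f.write(t.numpy().tobytes())
```

In practice you shouldn't need any of this yourself; simply appending `--mem_eff_save` to the existing full fine-tuning command should be enough.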
I'm using the following specs: