Closed defnotkenski closed 2 months ago
You'd better cache all the latents and text encoder (TE) outputs, especially the TE outputs.
Indeed, if you don't cache the TE outputs, the whole text encoder must sit on the GPU for every batch, which costs a huge amount of memory.
Disabling highvram and using xformers may help as well.
With cached outputs, both latents and text conditions, you can finetune Flux on an 80GB GPU.
I tested on 1x A100 80GB with adamw8bit.
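Putting the advice above together, a launch command might look something like the sketch below. This is only an illustration: the flag names assume a recent version of kohya-ss sd-scripts and may differ in yours, so check `python flux_train.py --help` before relying on any of them.

```shell
# Hedged sketch of a flux_train.py invocation with caching enabled.
# Flag names are assumptions based on kohya-ss sd-scripts; verify
# against --help for your checkout.
accelerate launch flux_train.py \
  --cache_latents_to_disk \
  --cache_text_encoder_outputs \
  --cache_text_encoder_outputs_to_disk \
  --optimizer_type adamw8bit \
  --xformers \
  ...  # plus your model, dataset, and output arguments
```

With both caches enabled, the text encoder can be dropped from the GPU during training steps, which is where the large memory saving comes from.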
Thanks, that did the trick. Ran into another error but at least it's not a memory issue anymore haha.
Hey guys,
I'm getting CUDA out of memory errors while running the `flux_train.py` script with 80GB of VRAM using the Prodigy optimizer. Any configuration recommendations to get this working? Thanks.
The error in question:
The executed command:
My complete configuration:
My Accelerate config: