kohya-ss / sd-scripts


Specify blocks to Train w/ Finetuning? #1636

Open setothegreat opened 1 week ago

setothegreat commented 1 week ago

Since it appears that Flux LoRA training can still be effective when only specific layers are trained, I'm wondering whether this functionality could be extended to finetuning, since that is where the biggest roadblocks around speed and hardware currently lie. Rather than being limited to Adafactor and dozens of hours per training run, being able to specify a subset of layers to train seems like it should lower hardware requirements, allow potentially more efficient optimizers on consumer-grade hardware, and bring training time down by an order of magnitude.

Is there some architecture-level roadblock I'm not aware of that prevents training only specific layers during a full finetune but doesn't apply when training a LoRA?
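Roughly what I have in mind, as a minimal sketch in plain PyTorch (not an existing sd-scripts option; the `double_blocks`/`single_blocks` prefixes and the `flux_model` variable here are just placeholders for however the loaded Flux transformer names its modules):

```python
import torch

def freeze_except_blocks(model: torch.nn.Module, train_prefixes: list[str]) -> list[torch.nn.Parameter]:
    """Freeze every parameter except those whose name starts with one of the given prefixes."""
    trainable = []
    for name, param in model.named_parameters():
        if any(name.startswith(p) for p in train_prefixes):
            param.requires_grad_(True)
            trainable.append(param)
        else:
            param.requires_grad_(False)
    return trainable

# flux_model: an already-loaded Flux transformer (placeholder for this sketch).
# Train only double blocks 7-12 and single block 20, for example.
prefixes = [f"double_blocks.{i}." for i in range(7, 13)] + ["single_blocks.20."]
params_to_train = freeze_except_blocks(flux_model, prefixes)

# The optimizer only ever sees the unfrozen subset, so its state stays small.
optimizer = torch.optim.AdamW(params_to_train, lr=1e-5)
```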

kohya-ss commented 1 week ago

The model parameters need to be kept in VRAM in bf16 regardless of which layers are trained, and that alone consumes about 22GB of VRAM (block swap is implemented to reduce that). Therefore, training only some layers will not help much in reducing VRAM.
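As a rough back-of-the-envelope illustration (the parameter counts below are assumptions, roughly Flux-sized, not exact figures): the bf16 weights stay resident no matter how many blocks are trained, so freezing blocks mainly reduces gradient and optimizer-state memory, not the 22GB of weights.

```python
GIB = 1024**3

n_params_total   = 12e9   # assumed Flux transformer size, ~12B parameters
n_params_trained = 3e9    # e.g. only a quarter of the blocks left unfrozen

weights_bf16 = n_params_total * 2 / GIB        # always resident (~22 GiB)
grads_bf16   = n_params_trained * 2 / GIB      # only for trained parameters
adamw_states = n_params_trained * 2 * 4 / GIB  # two fp32 moments per trained parameter

print(f"weights:   {weights_bf16:.1f} GiB (fixed, regardless of frozen blocks)")
print(f"gradients: {grads_bf16:.1f} GiB")
print(f"optimizer: {adamw_states:.1f} GiB")
```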