Paper: https://aclanthology.org/2022.acl-short.1/
Summary (my words):
As a model trainer, it would be nice if this direct preference optimization (DPO) trainer could be used to train only the biases of the U-net while keeping the weights frozen.
Initial testing shows that this approach lets us carefully steer the model toward better details and aesthetics while preserving most of its core structure.
Where full weight-and-bias tuning almost completely destroys SD 2.1-v with just 8 finetuning images, bias-only tuning allows pushing past 400 epochs on the same dataset.
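A minimal sketch of the bias-only setup in PyTorch, in the spirit of BitFit from the linked paper. The toy module below is a hypothetical stand-in; a real run would load the SD 2.1-v U-net (e.g. via diffusers) and pass it through the same helper:

```python
import torch
from torch import nn

def freeze_all_but_bias(model: nn.Module) -> list:
    """Freeze every parameter except bias terms; return the trainable biases."""
    trainable = []
    for name, param in model.named_parameters():
        if name.endswith(".bias") or name == "bias":
            param.requires_grad = True
            trainable.append(param)
        else:
            param.requires_grad = False
    return trainable

# Toy stand-in for a U-net block (hypothetical; substitute the real
# SD 2.1-v UNet here for an actual finetune).
model = nn.Sequential(
    nn.Conv2d(4, 8, 3, padding=1),
    nn.GroupNorm(2, 8),
    nn.SiLU(),
    nn.Conv2d(8, 4, 3, padding=1),
)
biases = freeze_all_but_bias(model)

# Only the biases reach the optimizer, so weight tensors never move.
optimizer = torch.optim.AdamW(biases, lr=1e-4)
```

Passing only the bias parameters to the optimizer (rather than filtering inside the training loop) also keeps optimizer state tiny, since biases are a small fraction of the total parameter count.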
Example:
The starting point ^
After just 810 steps ^
This is without any DPO, just finetuning with an MSE loss on the velocity (v-prediction) objective.
For comparison, the mode collapse of SD 2.1-v when tuning both weights and biases, which occurs in fewer steps:
This uses the same hyperparameters, e.g. learning rate, scheduler, dataset, and seeds.
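For reference, the velocity objective mentioned above can be sketched as follows. This is the standard v-prediction formulation (v = √ᾱ_t · ε − √(1 − ᾱ_t) · x₀) that SD 2.1-v is trained with; the tensor shapes and schedule values below are illustrative, not the model's actual ones:

```python
import torch
import torch.nn.functional as F

def v_prediction_target(x0: torch.Tensor, noise: torch.Tensor,
                        alphas_cumprod: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Velocity target: v = sqrt(alpha_bar_t) * eps - sqrt(1 - alpha_bar_t) * x0."""
    a = alphas_cumprod[t].sqrt().view(-1, 1, 1, 1)
    s = (1.0 - alphas_cumprod[t]).sqrt().view(-1, 1, 1, 1)
    return a * noise - s * x0

# Illustrative latents, noise, and schedule (made-up values).
x0 = torch.randn(2, 4, 8, 8)                       # "clean" latents
noise = torch.randn_like(x0)                       # sampled epsilon
alphas_cumprod = torch.linspace(0.999, 0.001, 1000)
t = torch.randint(0, 1000, (2,))

v_target = v_prediction_target(x0, noise, alphas_cumprod, t)
v_pred = torch.zeros_like(v_target)                # the U-net's output would go here
loss = F.mse_loss(v_pred, v_target)                # the MSE loss used for finetuning
```

At ᾱ_t = 1 the target reduces to pure noise, and at ᾱ_t → 0 it approaches −x₀, which is what makes the objective well-behaved across the whole noise schedule.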