rohitgandikota / sliders

Concept Sliders for Precise Control of Diffusion Models
https://sliders.baulab.info
MIT License

Negative LoRA values overly change image - How to make bipolar LoRA? #96

Open ItsAuver opened 5 months ago

ItsAuver commented 5 months ago

Hello, I am having some issues training a bipolar LoRA slider, i.e. a LoRA that supports both negative and positive values. My goal is for negative values to yield the opposite of positive values: for a height slider, negative values should make the subject shorter while positive values make the subject taller.

After training the slider, positive values work as expected, but negative values desaturate the image and change its style quite a bit, and often they do not capture the desired LoRA effect well. I am training on the base v6 model of PonyXL.

Below is the configuration for a height slider.

Config-xl.yml:

pretrained_model:
  name_or_path: "ponyDiffusionV6XL.safetensors" # you can also use .ckpt or .safetensors models
  v2: false # true if model is v2.x
  v_pred: false # true if model uses v-prediction
network:
  type: "c3lier" # or "c3lier" or "lierla"
  rank: 4
  alpha: 1
  training_method: "noxattn"
train:
  precision: "bfloat16"
  noise_scheduler: "ddpm" # or "ddpm", "lms", "euler_a"
  iterations: 1000
  lr: 0.0002
  optimizer: "AdamW"
  lr_scheduler: "constant"
  max_denoising_steps: 60
save:
  name: "temp"
  path: "./models"
  per_steps: 250
  precision: "bfloat16"
logging:
  use_wandb: false
  verbose: false
other:
  use_xformers: true

Example prompts file:

- target: "1boy" # what word for erasing the positive concept from
  positive: "tall" # concept to erase
  unconditional: "short" # word to take the difference from the positive concept
  neutral: "1boy" # starting point for conditioning the target
  action: "enhance" # erase or enhance
  guidance_scale: 4
  rank: 4
  resolution: 512
  dynamic_resolution: false
  batch_size: 1
- target: "1boy" # what word for erasing the positive concept from
  positive: "very tall" # concept to erase
  unconditional: "very short" # word to take the difference from the positive concept
  neutral: "1boy" # starting point for conditioning the target
  action: "enhance" # erase or enhance
  guidance_scale: 4
  rank: 4
  resolution: 512
  dynamic_resolution: false
  batch_size: 1
- target: "1girl" # what word for erasing the positive concept from
  positive: "very tall" # concept to erase
  unconditional: "very short" # word to take the difference from the positive concept
  neutral: "1girl" # starting point for conditioning the target
  action: "enhance" # erase or enhance
  guidance_scale: 4
  rank: 4
  resolution: 512
  dynamic_resolution: false
  batch_size: 1

Command-line argument to start training:

python trainscripts/textsliders/train_lora_xl.py --attributes "1girl, 1boy, height" --name 'height-slider-v10' --rank 4 --alpha 1 --config_file "trainscripts/textsliders/data/config-xl.yaml"

I have played around quite a bit with the prompting to no avail. Is it better to train two slider LoRAs, and then carefully combine them to make a bipolar LoRA slider? Am I just messing something up with the config, or is training on non-base SDXL models not supported? Thanks.
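For context on what a negative slider value does mechanically: in common LoRA implementations the learned update is added to the base weight as multiplier * (alpha / rank) * (up @ down), so a negative multiplier applies the mirrored weight change. Below is a minimal pure-Python sketch of that arithmetic. The toy matrices and the `lora_delta` helper are illustrative assumptions, not code from this repository.

```python
# Toy illustration of LoRA scaling. `lora_delta` is a hypothetical helper,
# not a function from the sliders repository.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_delta(up, down, multiplier, alpha):
    """Weight update added to the base weight: multiplier * (alpha / rank) * up @ down."""
    rank = len(down)  # the inner dimension of the two factors is the LoRA rank
    scale = alpha / rank
    return [[multiplier * scale * v for v in row] for row in matmul(up, down)]

# Rank-4 factors for a 2x2 weight, matching rank: 4 and alpha: 1 in the config.
up = [[1.0, 0.0, 2.0, -1.0],
      [0.5, 1.0, 0.0, 3.0]]
down = [[1.0, 2.0],
        [0.0, 1.0],
        [-1.0, 0.5],
        [2.0, 0.0]]

pos = lora_delta(up, down, multiplier=2.0, alpha=1)
neg = lora_delta(up, down, multiplier=-2.0, alpha=1)

print("delta at +2:", pos)
print("delta at -2:", neg)
```

The weight change at -2 is the exact negative of the change at +2. Because the rest of the network is nonlinear, a mirrored weight change does not guarantee a mirrored change in the image, which may be part of why the negative direction behaves differently.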

rohitgandikota commented 5 months ago

hey @ItsAuver

As you say, the positive direction seems to work, so I am guessing it's not an issue with the model checkpoint. Some suggestions from a first glance:

  1. Use the --attributes parameter in your training command for attributes you DO NOT want to change, for instance the race or gender of a person. Given that you explicitly mention 1boy and 1girl in your prompts, you can either remove them or add attributes that you do not want to change, like race or background attributes (e.g. "colorful background, bokeh background, white, black, asian").

  2. In your prompts file, I would make a slight change:

    - target: "1boy" # what word for erasing the positive concept from
      positive: "1boy, very tall" # concept to erase
      unconditional: "1boy, very short" # word to take the difference from the positive concept
      neutral: "1boy" # starting point for conditioning the target
      action: "enhance" # erase or enhance
      guidance_scale: 4
      rank: 4
      resolution: 512
      dynamic_resolution: false
      batch_size: 1

    I would add the target prompt to the positive and unconditional too.
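If there are several targets, the second suggestion can be applied mechanically. The following is a minimal sketch using only the standard library. The `make_entry` and `to_yaml` helpers are hypothetical, and the field values are copied from the prompts file in this thread rather than from any documented schema.

```python
# Hypothetical helpers for generating prompt entries where the target is
# prepended to both the positive and the unconditional prompt.

def make_entry(target, positive, unconditional):
    """Build one prompts-file entry with the same fields used in this thread."""
    return {
        "target": target,
        "positive": f"{target}, {positive}",
        "unconditional": f"{target}, {unconditional}",
        "neutral": target,
        "action": "enhance",
        "guidance_scale": 4,
        "rank": 4,
        "resolution": 512,
        "dynamic_resolution": False,
        "batch_size": 1,
    }

def to_yaml(entries):
    """Render the entries as a YAML block sequence (strings quoted, bools lowercased)."""
    lines = []
    for entry in entries:
        for i, (key, value) in enumerate(entry.items()):
            prefix = "- " if i == 0 else "  "
            if isinstance(value, bool):  # check bool first: bool is a subclass of int
                text = "true" if value else "false"
            elif isinstance(value, str):
                text = f'"{value}"'
            else:
                text = str(value)
            lines.append(f"{prefix}{key}: {text}")
    return "\n".join(lines)

entries = [make_entry(t, "very tall", "very short") for t in ("1boy", "1girl")]
print(to_yaml(entries))
```

The printed text can be pasted into the prompts file. Writing the entries by hand works just as well; the script only keeps the target, positive, and unconditional fields consistent.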

Let me know if these hacks work. All the best!