IrisRainbowNeko / HCP-Diffusion

A universal Stable-Diffusion toolbox
Apache License 2.0

Bug Report: ConfigAttributeError: Missing key clip_skip #8

Closed · iamwangyabin closed this 1 year ago

iamwangyabin commented 1 year ago

Please add clip_skip: 2 to cfgs/infer/v1.yaml; without it, inference fails with ConfigAttributeError: Missing key clip_skip.

Below is my inference config, using an enma_ai LoRA trained with HCP-Diffusion.

pretrained_model: /home/yabin/HCP-Diffusion/converted_models/Acertain
prompt: enma_ai, 1girl, tree, solo, plant, bush, outdoors, grass, garden, nature,
  branch, day, sky,
neg_prompt: lowres, bad anatomy, bad hands, text, error, missing fingers, extra digit,
  fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts,
  signature, watermark, username, blurry
out_dir: output/
emb_dir: embs/
N_repeats: 1
bs: 4
num: 1
seed: null
fp16: true
clip_skip: 2
save:
  save_cfg: true
  image_type: png
  quality: 95
infer_args:
  width: 512
  height: 768
  guidance_scale: 7.5
new_components: {}
merge:
  exp_dir: 2023-04-17-21-59-53
  alpha: 0.8
  group1:
    type: unet
    base_model_alpha: 1.0
    lora:
    - path: exps/${....exp_dir}/ckpts/unet-2000.safetensors
      alpha: ${....alpha}
      layers: all
      mask: null
    part: null
  group2:
    type: TE
    base_model_alpha: 1.0
    lora:
    - path: exps/${....exp_dir}/ckpts/text_encoder-2000.safetensors
      alpha: ${....alpha}
      layers: all
      mask: null
    part: null
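
A note on the ${....exp_dir} and ${....alpha} entries above: these are OmegaConf relative interpolations, where each dot beyond the first moves one level up the config tree (here, from the lora list item up to merge). A minimal check of how they resolve, assuming the configs are loaded with OmegaConf, which the syntax suggests:

from omegaconf import OmegaConf

cfg = OmegaConf.create("""
merge:
  exp_dir: 2023-04-17-21-59-53
  alpha: 0.8
  group1:
    lora:
    - path: exps/${....exp_dir}/ckpts/unet-2000.safetensors
      alpha: ${....alpha}
""")
# Four dots: one for the containing node, three more to climb
# lora[0] -> lora -> group1 -> merge.
print(cfg.merge.group1.lora[0].path)   # exps/2023-04-17-21-59-53/ckpts/unet-2000.safetensors
print(cfg.merge.group1.lora[0].alpha)  # 0.8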

Below is an image generated with ACertain, which is also the base model I trained on.

[generated image: 0-enma_ai, 1girl, tree, solo, plant, bush, outdoors, grass, garden, nature, branch, day, sky]

Currently, I am unable to evaluate the quality of this LoRA model compared to my previous LoRA trained with sd-scripts, mainly because there are numerous training-script settings in this repository that I do not understand. However, I am confident that this is an excellent project. The code is clean and easy to use, with support for training on Linux and on multiple GPUs. Many algorithms are available (although I am not sure exactly how many methods have been implemented).

I would also like to see conversion scripts that transfer this LoRA format to the common LoRA format, so it can be used in the WebUI with more controllable parameters.

By the way, I'm curious about the rank parameter in the configuration file. Initially, I thought it meant dim in sd-scripts, but when I set it to 128, the final output model was more than 200 MB (while the same dim setting yields only 144 MB in sd-scripts). I also noticed that the scale parameter exists in the code, but I am currently unable to set it from the configuration file. I believe scale is equivalent to alpha.

IrisRainbowNeko commented 1 year ago

In this framework, clip_skip=1 is equivalent to clip_skip=2 in the WebUI; clip_skip=2 here is clip_skip=3 in the WebUI.
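
Concretely, the mapping can be pictured as an off-by-one in which hidden state is taken from the text encoder. A small sketch of the assumed indexing (illustrative only, not HCP-Diffusion's actual code; the WebUI additionally re-applies the final layer norm):

import torch
from transformers import CLIPTextModel, CLIPTokenizer

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

tokens = tokenizer("1girl, tree", return_tensors="pt")
with torch.no_grad():
    out = text_encoder(**tokens, output_hidden_states=True)

# Assumed indexing: HCP-Diffusion's clip_skip=k takes hidden_states[-(k + 1)],
# while the WebUI's clip_skip=k takes hidden_states[-k], hence the offset of one.
hcp_clip_skip = 1
webui_clip_skip = 2
assert torch.equal(out.hidden_states[-(hcp_clip_skip + 1)],
                   out.hidden_states[-webui_clip_skip])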

IrisRainbowNeko commented 1 year ago

> By the way, I'm curious about the rank parameter in the configuration file. Initially, I thought it meant dim in sd-scripts, but when I set it to 128, the final output model was more than 200 MB (while the same dim setting yields only 144 MB in sd-scripts). I also noticed that the scale parameter exists in the code, but I am currently unable to set it from the configuration file. I believe scale is equivalent to alpha.

The file is probably larger because LoRA is added to more layers, and the set of layers to which LoRA is currently added is not yet aligned with sd-scripts. An sd-scripts-compatible LoRA configuration file will be provided later.
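
As a rough sanity check, the checkpoint size scales with both the rank and the number of layers that receive LoRA, so at the same rank a larger layer set directly explains the larger file. A back-of-the-envelope sketch (hypothetical layer shapes, not the actual HCP-Diffusion layer list):

# Each LoRA'd Linear(in_f, out_f) adds a down matrix (rank x in_f) and an
# up matrix (out_f x rank); at fp16 that is 2 bytes per parameter.
def lora_bytes(layer_shapes, rank, bytes_per_param=2):
    return sum((in_f * rank + rank * out_f) * bytes_per_param
               for in_f, out_f in layer_shapes)

# Hypothetical: 256 projections of 768x768 at rank 128 -> ~96 MiB.
# Adding more layers at the same rank grows the file proportionally.
print(lora_bytes([(768, 768)] * 256, rank=128) / 2**20)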

For the scale used in training, you can set it in the config like this:

lora_unet:
  - lr: 1e-4
    rank: 8
    scale: 0.8
    layers:
      - 're:.*\.attn.?$'
      - 're:.*\.ff\.net\.0$'

Maybe it would be clearer to rename scale to alpha everywhere.
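
For reference, here is how a scale multiplier usually enters a LoRA forward pass (a generic sketch; HCP-Diffusion's internals may differ). In sd-scripts the effective multiplier is alpha / rank, so scale here plays roughly the role of alpha, up to that normalization:

import torch

def lora_forward(x, W, down, up, scale):
    base = x @ W.T                 # frozen base weight, shape (out_f, in_f)
    delta = (x @ down.T) @ up.T    # low-rank update; down: (rank, in_f), up: (out_f, rank)
    return base + scale * delta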