Closed zhangvia closed 9 months ago
The width (dim) and the number of ResNet blocks can both be reduced; a zero-conv at each connection keeps the shapes matching, I guess.
I just compared the config files of ControlNet-XS and the original ControlNet, and they look almost the same. The configs are from sd21_encD_canny_14m.yaml and cldm_v15.yaml, in that order:
control_stage_config:
  target: ldm.modules.diffusionmodules.twoStreamControl.TwoStreamControlNet
  params:
    use_checkpoint: true
    image_size: 32
    in_channels: 4
    out_channels: 4
    hint_channels: 3
    model_channels: 320
    attention_resolutions:
      - 4
      - 2
      - 1
    num_res_blocks: 2
    channel_mult:
      - 1
      - 2
      - 4
      - 4
    num_head_channels: 8
    use_spatial_transformer: true
    use_linear_in_transformer: true
    transformer_depth: 1
    context_dim: 1024
    legacy: false
    infusion2control: cat
    infusion2base: add
    guiding: encoder_double
    two_stream_mode: cross
    control_model_ratio: 0.0125
control_stage_config:
  target: cldm.cldm.ControlNet
  params:
    image_size: 32 # unused
    in_channels: 4
    hint_channels: 3
    model_channels: 320
    attention_resolutions: [ 4, 2, 1 ]
    num_res_blocks: 2
    channel_mult: [ 1, 2, 4, 4 ]
    num_heads: 8
    use_spatial_transformer: True
    transformer_depth: 1
    context_dim: 768
    use_checkpoint: True
    legacy: False
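To make "almost the same" concrete, here is a minimal sketch that diffs the two `params` blocks above. The dictionaries are copied by hand from the configs quoted in this thread; the variable names (`xs_params`, `orig_params`, and so on) are made up for illustration and do not come from either repository.

```python
# Params copied by hand from the two configs quoted in this thread.
xs_params = {
    "use_checkpoint": True, "image_size": 32, "in_channels": 4,
    "out_channels": 4, "hint_channels": 3, "model_channels": 320,
    "attention_resolutions": [4, 2, 1], "num_res_blocks": 2,
    "channel_mult": [1, 2, 4, 4], "num_head_channels": 8,
    "use_spatial_transformer": True, "use_linear_in_transformer": True,
    "transformer_depth": 1, "context_dim": 1024, "legacy": False,
    "infusion2control": "cat", "infusion2base": "add",
    "guiding": "encoder_double", "two_stream_mode": "cross",
    "control_model_ratio": 0.0125,
}
orig_params = {
    "image_size": 32, "in_channels": 4, "hint_channels": 3,
    "model_channels": 320, "attention_resolutions": [4, 2, 1],
    "num_res_blocks": 2, "channel_mult": [1, 2, 4, 4], "num_heads": 8,
    "use_spatial_transformer": True, "transformer_depth": 1,
    "context_dim": 768, "use_checkpoint": True, "legacy": False,
}

# Keys that exist in only one of the two configs.
only_in_xs = sorted(set(xs_params) - set(orig_params))
only_in_orig = sorted(set(orig_params) - set(xs_params))

# Shared keys whose values differ, as (ControlNet-XS, original) pairs.
differing = {
    key: (xs_params[key], orig_params[key])
    for key in xs_params.keys() & orig_params.keys()
    if xs_params[key] != orig_params[key]
}

print("only in ControlNet-XS:", only_in_xs)
print("only in original:", only_in_orig)
print("different values:", differing)
```

The shared architecture keys (`model_channels`, `channel_mult`, `num_res_blocks`, `attention_resolutions`) come out identical. Apart from `context_dim` and the attention-head key, the differences are all keys that appear only in the ControlNet-XS config, including `control_model_ratio`.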
Maybe the config file sd21_encD_canny_14m.yaml just adds the connections between the ControlNet encoder and the UNet encoder?
The out_channels of the control model is reduced from 320 to 4.
I noticed that, but the ControlNet-XS code doesn't use out_channels in the init function. The out_channels of every res block are computed from model_channels, which is the same as in the original ControlNet code.
The width (dim) and the number of ResNet blocks can both be reduced; a zero-conv at each connection keeps the shapes matching, I guess.
infusion_factor = int(1 / control_model_ratio)
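As a rough illustration of what that ratio could mean for model size, the sketch below plugs in the values from the sd21_encD_canny_14m.yaml config quoted earlier. It assumes, without confirmation from this thread, that the control model's base width is `model_channels * control_model_ratio`. The helper names `control_widths` and `conv3x3_params` are hypothetical and not part of the ControlNet-XS code.

```python
def control_widths(model_channels, channel_mult, control_model_ratio):
    """Return (infusion_factor, base widths, control widths) per UNet level.

    ASSUMPTION: the control model's base width is
    max(1, int(model_channels * control_model_ratio)); this thread does not
    confirm how the repository actually applies the ratio.
    """
    infusion_factor = int(1 / control_model_ratio)  # the line quoted above
    ctrl_channels = max(1, int(model_channels * control_model_ratio))
    base = [model_channels * m for m in channel_mult]
    ctrl = [ctrl_channels * m for m in channel_mult]
    return infusion_factor, base, ctrl


def conv3x3_params(c_in, c_out):
    """Weights plus biases of a single 3x3 convolution."""
    return 3 * 3 * c_in * c_out + c_out


infusion_factor, base, ctrl = control_widths(320, [1, 2, 4, 4], 0.0125)
print(infusion_factor)  # 80
print(base)             # [320, 640, 1280, 1280]
print(ctrl)             # [4, 8, 16, 16]

# One 3x3 conv at the first level: full width versus ratio-scaled width.
print(conv3x3_params(base[0], base[0]))  # 921920
print(conv3x3_params(ctrl[0], ctrl[0]))  # 148
```

Under that assumption, both configs can list `model_channels: 320` while the control branch is far narrower, because the parameter count of a convolution grows with the product of its input and output widths.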
I've read the blog, but it seems that models A and B are the same as the original ControlNet: both have a complete copy of the UNet encoder, and model C has a whole UNet. Why do they have fewer weights than the original ControlNet?