Xynonners opened this issue 7 months ago
Hi, I'm interested in understanding what the code does.
**easyblock("model.diffusion_model.output_blocks.6.0", "P_bg208","P_bg209"), **conv("model.diffusion_model.output_blocks.6.0.skip_connection","P_bg210","P_bg211"), **norm("model.diffusion_model.output_blocks.6.1.norm", "P_bg212"), **conv("model.diffusion_model.output_blocks.6.1.proj_in", "P_bg212", "P_bg213"), **dense("model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn1.to_q", "P_bg214", "P_bg215", bias=False),
In some sections, the P_bg groups are disconnected, in the sense that each layer gets its own unique set of group names (no overlap).
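My current mental model (sketched below, with guessed helper signatures, since the helpers aren't shown here) is that each entry maps a parameter name to one permutation-group name per tensor axis, and "disconnected" just means a group name is never reused across a layer boundary:

```python
# Minimal sketch of git-re-basin-style spec helpers; the exact signatures in
# this repo may differ. Each entry maps a parameter name to one permutation
# group per tensor axis; None marks an axis that is never permuted.
def dense(name, p_in, p_out, bias=True):
    axes = {f"{name}.weight": (p_out, p_in)}
    if bias:
        axes[f"{name}.bias"] = (p_out,)
    return axes

def conv(name, p_in, p_out):
    # PyTorch conv weight layout: (out_channels, in_channels, kH, kW)
    return {f"{name}.weight": (p_out, p_in, None, None)}

def norm(name, p):
    return {f"{name}.weight": (p,), f"{name}.bias": (p,)}

# "Disconnected": the output group of one layer is not reused as the input
# group of the next, so no permutation has to be consistent across that
# boundary.
spec = {
    **conv("layer_a", "P_0", "P_1"),  # layer_a's output axis lives in P_1
    **conv("layer_b", "P_2", "P_3"),  # layer_b's input axis lives in P_2, not P_1
}
```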
**dense("model.diffusion_model.output_blocks.6.1.transformer_blocks.0.attn2.to_out.0", "P_bg224","P_bg225", bias=True), **norm("model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm1", "P_bg225"), **norm("model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm2", "P_bg225"), **norm("model.diffusion_model.output_blocks.6.1.transformer_blocks.0.norm3", "P_bg225"), **conv("model.diffusion_model.output_blocks.6.1.proj_out", "P_bg225", "P_bg226"),
But in other sections, like this one with the norms, the P_bg groups are connected (P_bg225 is shared across several layers).
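If I understand "connected" correctly, every tensor axis tagged with the same group (here P_bg225) has to receive one and the same permutation; roughly something like this (PyTorch, the function name is mine):

```python
import torch

def apply_group_permutation(state_dict, axes_to_perm, group, perm):
    """Apply one permutation to every tensor axis tagged with `group`.

    `perm` is a LongTensor of indices; sharing a group name is exactly what
    couples layers together, since they all get reordered by this one `perm`.
    """
    out = dict(state_dict)
    for name, axis_groups in axes_to_perm.items():
        for axis, g in enumerate(axis_groups):
            if g == group:
                out[name] = torch.index_select(out[name], axis, perm)
    return out

# e.g. one permutation for "P_bg225" would reorder to_out.0's output rows,
# all three norms' weight/bias vectors, and proj_out's input channels together.
```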
**norm("cond_stage_model.transformer.text_model.encoder.layers.1.layer_norm2", "P_bg375"), **dense("cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.k_proj", "P_bg375", "P_bg376",bias=True), **dense("cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.v_proj", "P_bg375", "P_bg376",bias=True), **dense("cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.q_proj", "P_bg375", "P_bg376",bias=True), **dense("cond_stage_model.transformer.text_model.encoder.layers.2.self_attn.out_proj", "P_bg375", "P_bg376",bias=True), **norm("cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm1", "P_bg376"), **dense("cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc1", "P_bg376", "P_bg377", bias=True), **dense("cond_stage_model.transformer.text_model.encoder.layers.2.mlp.fc2", "P_bg377", "P_bg378", bias=True), **norm("cond_stage_model.transformer.text_model.encoder.layers.2.layer_norm2", "P_bg378"), **dense("cond_stage_model.transformer.text_model.encoder.layers.3.self_attn.k_proj", "P_bg378", "P_bg379",bias=True),
In the text encoder, everything seems connected, since every block has a norm.
**conv("model.diffusion_model.output_blocks.10.0.skip_connection","P_bg288","P_bg289"), **norm("model.diffusion_model.output_blocks.10.1.norm", "P_bg290"), **conv("model.diffusion_model.output_blocks.10.1.proj_in", "P_bg290", "P_bg291"),
But here in the UNet, the norm seems to be disconnected on one side (P_bg289 -> P_bg290)?
Could you explain the reasoning behind these choices? Thanks. (I'm trying to figure out how to implement re-basin correctly.)
@AI-Casanova
@Xynonners
Well, it's been a while, but I'm pretty sure I disconnected them because the weight matching algorithm wouldn't compute when they were connected.
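I don't remember the exact failure, but for reference this is roughly the per-group step of weight matching (simplified; helper names are mine): every parameter axis tagged with the group contributes an (n x n) term to one cost matrix, so everything tied to a shared group has to line up for the accumulation and the assignment solve to go through.

```python
import torch
from scipy.optimize import linear_sum_assignment

def match_group(group, axes_to_perm, params_a, params_b):
    """One (simplified) weight-matching step for a single permutation group.

    NOTE: the full algorithm also applies the permutations already chosen for
    the *other* axes of each parameter before computing the cost; that is
    omitted here to keep the sketch short.
    """
    cost = None
    for name, axis_groups in axes_to_perm.items():
        for axis, g in enumerate(axis_groups):
            if g != group:
                continue
            n = params_a[name].shape[axis]
            # Flatten so the permuted axis comes first: (n, rest)
            a = torch.moveaxis(params_a[name], axis, 0).reshape(n, -1)
            b = torch.moveaxis(params_b[name], axis, 0).reshape(n, -1)
            term = a @ b.T                                 # (n, n) similarity term
            cost = term if cost is None else cost + term   # every n in the group must agree
    rows, cols = linear_sum_assignment(cost.numpy(), maximize=True)
    return torch.as_tensor(cols)
```

Splitting the groups in the spec sidesteps whatever broke there, at the cost of not aligning those particular layers jointly.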