Open SmileTAT opened 1 day ago
hi @SmileTAT what do you mean by "red layers"? Would you mind sharing some screenshots?
input and output images @zengyh1900
Differences between the dev and main branches:

1. Concatenation order of the mask and the conditioning latents:
   - dev: `conditioning_latents = torch.concat([mask, conditioning_latents], 1)`
   - main: `conditioning_latents = torch.concat([conditioning_latents, mask], 1)`
2. Comparison used to binarize the original mask:
   - dev: `original_mask = (original_mask.sum(1)[:, None, :, :] > 0).to(image.dtype)`
   - main: `original_mask = (original_mask.sum(1)[:, None, :, :] < 0).to(image.dtype)`
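To make the two differences concrete, here is a small standalone sketch. It does not use the repository code. The tensor shapes (4 latent channels, a 1-channel mask, 8x8 resolution) and the variable names `dev_cond`, `main_cond`, `dev_mask` and `main_mask` are made up for illustration, and it assumes the mask only contains non-negative values.

```python
import torch

# Hypothetical shapes for illustration only: batch 1, 4 latent channels, 8x8.
latents = torch.randn(1, 4, 8, 8)
mask = torch.zeros(1, 1, 8, 8)
mask[:, :, 2:6, 2:6] = 1.0  # 4x4 region marked for inpainting

# Difference 1: the concatenation order decides which channel holds the mask.
dev_cond = torch.concat([mask, latents], 1)   # mask ends up in channel 0
main_cond = torch.concat([latents, mask], 1)  # mask ends up in channel 4
print(dev_cond.shape, main_cond.shape)        # both have 5 channels

# Difference 2: the comparison used to binarize a 3-channel mask.
original_mask = mask.repeat(1, 3, 1, 1)  # values are 0 or 1, never negative
dev_mask = (original_mask.sum(1)[:, None, :, :] > 0).to(latents.dtype)
main_mask = (original_mask.sum(1)[:, None, :, :] < 0).to(latents.dtype)
print(dev_mask.sum().item(), main_mask.sum().item())
```

Both orderings give a tensor of the same shape, so a mismatch between the order used at inference and the order a checkpoint expects would not raise an error. With a non-negative mask, the `< 0` comparison selects no pixels at all, while `> 0` selects the masked region.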
On the dev branch, I initialize the pipeline with the following code, but the output image is covered with a red layer:

```python
# brushnet-based version
unet = UNet2DConditionModel.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    subfolder="unet",
    revision=None,
    torch_dtype=weight_dtype,
)
text_encoder = CLIPTextModel.from_pretrained(
    "stable-diffusion-v1-5/stable-diffusion-v1-5",
    subfolder="text_encoder",
    revision=None,
    torch_dtype=weight_dtype,
)
brushnet = BrushNetModel.from_unet(unet)
```