zhangvia opened this issue 1 year ago
Hi, are there any resources or scripts that can help with training a ControlNet inpainting model?
Take the masked image as the control image, and have the model predict the full, original unmasked image.
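In other words, each training example pairs the masked image (as conditioning) with the original image (as reconstruction target). A minimal sketch with hypothetical helper names, not anyone's actual script:

```python
import numpy as np
from PIL import Image

def make_training_pair(image_path, mask):
    """mask: boolean HxW array, True where the model must inpaint."""
    target = np.array(Image.open(image_path).convert("RGB"))
    control = target.copy()
    # The fill value for the hidden region is a design choice; this thread
    # variously uses 0, 255 (white), or -1 after normalization.
    control[mask] = 0
    return {"conditioning_image": control, "image": target}
```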
I tried this, but the output has nothing to do with my control (the masked image). Here is the condition:

control: [image]
reconstruction: [image]

But the output is as below: [image]
To train an inpainting ControlNet, which scheduler should I use: DDPM or DDIM?
I am also trying to train the inpainting ControlNet, but so far I haven't succeeded. Maybe you should first read the code in gradio_inpainting.py; you will see that you should use -1 to mask the normalized image.
(I guess.)
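A sketch of what that masking convention could look like (the shapes and the exact sentinel handling are assumptions, not the verbatim gradio_inpainting.py code):

```python
import numpy as np

# Assumed inputs: HxWx3 uint8 image, HxW uint8 mask (255 = region to inpaint).
image = np.zeros((512, 512, 3), dtype=np.uint8)  # stand-in for the real photo
mask = np.zeros((512, 512), dtype=np.uint8)
mask[128:256, 128:256] = 255

control = image.astype(np.float32)
control[mask > 127] = -255.0  # sentinel for "unknown" pixels
control = control / 255.0     # valid pixels land in [0, 1], masked pixels at -1
```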
@Hubert2102 I am not sure whether you still need a solution. I faced a similar problem and found a solution as well.
Problem: inpainting tries to repaint the whole scene instead of just inpainting the masked portion.
Solution:
Your control image, masked image, and prompt are fine as they are.
Don't create/load the control_sd15_ini.ckpt file as we generally do for ControlNet training. Instead, as @tonyhuang2022 said, while loading the model for training you have to use the code from gradio_inpainting.py, i.e.:
```python
model = create_model('PATH/TO/control_v11p_sd15_inpaint.yaml')
model.load_state_dict(load_state_dict('PATH/TO/v1-5-pruned.ckpt'), strict=False)
model.load_state_dict(load_state_dict('PATH/TO/control_v11p_sd15_inpaint.pth'), strict=False)
```
Hope this helps.
> I am also trying to train the inpainting ControlNet, but so far I haven't succeeded. Maybe you should first read the code in gradio_inpainting.py; you will see that you should use -1 to mask the normalized image. (I guess.)
- When sampling the image, we need to apply the mask during the denoising process.
- When we get the final outputs, we should also use the mask to post-process the generated images (see the sketch below).
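A sketch of the first step, assuming a diffusers-style scheduler (`add_noise` is the real diffusers API; the surrounding names are assumptions):

```python
import torch

def blend_known_region(x_t, known_latents, mask, noise_scheduler, t):
    """mask: 1 where the model should inpaint, 0 where the original is known."""
    noise = torch.randn_like(known_latents)
    # Noise the known latents to the current timestep so both halves match.
    known_t = noise_scheduler.add_noise(known_latents, noise, t)
    return mask * x_t + (1 - mask) * known_t

# For the second step, after the final denoising step and VAE decode,
# paste the unmasked pixels back exactly:
# final = mask * generated_image + (1 - mask) * original_image
```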
Your words helped me crack this in time, @tonyhuang2022. I am not sure why it fails for you. Let me know if you still have the problem; I can guide you. Check my comment to @Hubert2102.
@Lakshmanaraja Hi there, I want to fine-tune my ControlNet on my fine-tuned SD 1.5 inpainting model. However, I found that after 100 steps my inpainting area became all black. Would you mind sharing some advice?
BTW, I modified the diffusers code. The control image is the masked image with -1 in the mask area. It would be great if you could share your scripts.
If the whole image becomes black, it indicates a vanishing-gradient issue. It may be because the GPU is not compatible with xFormers (e.g., in my personal experience on Paperspace, the A6000 is not compatible but the A5000 is; I don't have a definitive list of which GPU instances are compatible with xFormers). So check what server you are using.
If only the masked area is black, that is actually fine; you just need to train more.
Note: I didn't change the mask area to -1 in the control image. I just made it white, i.e., assigned 255 in the NumPy array:
```python
import numpy as np
from PIL import Image

# xmin, xmax, ymin, ymax represent the area to be masked.
# (The path below is a placeholder; the original snippet was truncated.)
pil_array = np.array(Image.open('PATH/TO/image.png'))
pil_array[ymin:ymax, xmin:xmax] = 255  # paint the masked region white
```
Note: I am still using the ControlNet 1.0 training code, with only the small changes I mentioned in my reply to @Hubert2102, but I use the inpainting model from ControlNet 1.1.
I am also facing this issue. I am doing this: the training script is the one used by the diffusers txt2img ControlNet example.
Is the logic correct? Normally in the training phase we start with the target image (the one to be generated) and add noise to it, while the ControlNet gets the conditioning image; the UNet predicts the noise, which goes into the loss function. Is it the same when training an inpainting ControlNet?
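For reference, a minimal sketch of the standard diffusers-style ControlNet training step (variable names such as `vae`, `prompt_embeds`, and `control_images` are assumptions standing in for the real script's objects):

```python
import torch
import torch.nn.functional as F

# Latents of the target (unmasked) image, encoded by the VAE.
latents = vae.encode(target_images).latent_dist.sample() * vae.config.scaling_factor
noise = torch.randn_like(latents)
timesteps = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                          (latents.shape[0],), device=latents.device)
noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

# The ControlNet receives the conditioning image (the masked image here).
down_res, mid_res = controlnet(
    noisy_latents, timesteps,
    encoder_hidden_states=prompt_embeds,
    controlnet_cond=control_images,
    return_dict=False,
)

noise_pred = unet(
    noisy_latents, timesteps,
    encoder_hidden_states=prompt_embeds,
    down_block_additional_residuals=down_res,
    mid_block_additional_residual=mid_res,
).sample

loss = F.mse_loss(noise_pred, noise)  # epsilon (noise-prediction) objective
```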
Hello, we are encountering the same problem now. Have you solved it?
I'm only making guesses here, but some things come to mind: try `--proportion_empty_prompts=0.5`, which makes a lot of sense here, as the very goal of inpainting is to guess unknown areas. Here's a baseline for all your inpaint models: https://github.com/lllyasviel/ControlNet/discussions/561
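`--proportion_empty_prompts` in the diffusers training script replaces that fraction of captions with the empty string. The idea in miniature:

```python
import random

def maybe_drop_prompt(caption: str, p: float = 0.5) -> str:
    # With probability p, train on an empty caption so the model learns
    # to rely on the conditioning image alone.
    return "" if random.random() < p else caption
```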
Is it normal for your trained models to show discoloration or slight changes in areas that are not masked? How do you keep these areas exactly as they are?
True, it is quite challenging. Were you able to solve this?
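One common mitigation (a sketch of a standard trick, not something confirmed in this thread): composite the generated pixels back only inside the mask, so unmasked areas stay byte-identical to the original.

```python
import numpy as np
from PIL import Image

# Assumed inputs: original_image, generated_image, mask_image are PIL images
# of the same size; the mask is white where inpainting happened.
orig = np.array(original_image)
gen = np.array(generated_image)
m = (np.array(mask_image.convert("L")) > 127)[..., None]  # HxW -> HxWx1 bool
result = Image.fromarray(np.where(m, gen, orig).astype(np.uint8))
```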
When I retrain the inpainting ControlNet, I get bad results. Did I make any mistakes? Control (mask-area pixels are zero): [image] Ground truth: [image] Model result: [image]
The scheduler is a critical factor that contributes to stability. I recommend exploring the use of DDIM or DDPM as your scheduling approach.
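In diffusers, either scheduler is a drop-in swap (a sketch; the checkpoint ID is the commonly used SD 1.5 repo, an assumption here):

```python
from diffusers import DDIMScheduler, DDPMScheduler

# DDPM is the usual training-time noise scheduler; DDIM is a common
# deterministic choice at inference. Both can load from the same config.
noise_scheduler = DDPMScheduler.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="scheduler")
ddim = DDIMScheduler.from_config(noise_scheduler.config)
```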
> Take the masked image as the control image, and have the model predict the full, original unmasked image.
Doesn't the model also require the mask itself as an input?
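For comparison, the diffusers ControlNet inpaint pipeline does take the mask explicitly, alongside the control image (a sketch; `image`, `mask_image`, and `control_image` are assumed PIL/tensor inputs):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16)

prompt = "your prompt here"
# image: original PIL image; mask_image: white-on-black PIL mask;
# control_image: the conditioning image fed to the ControlNet.
result = pipe(prompt, image=image, mask_image=mask_image,
              control_image=control_image).images[0]
```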
@geroldmeisinger
I've resumed ControlNet inpainting training on a 100-image dataset, but the result looks like this: [image]
This is the mask; I just want to invert the mask, so the background is inpainted rather than the subject: [image]
Or should I make the mask like this? [image]
Could you please share the details of how you trained your inpainting ControlNet model? Did you just take the masked image as the input to the ControlNet and force the model to predict the masked region?