zhangvia opened this issue 1 year ago
Hi, are there any resources or scripts that can help with training a ControlNet inpainting model?
Take the masked image as the control image, and have the model predict the full, original unmasked image.
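In other words, each training example pairs the masked image (as conditioning) with the original image (as reconstruction target). A minimal sketch with hypothetical helper names, not anyone's actual script:

```python
import numpy as np
from PIL import Image

def make_training_pair(image_path, mask):
    """mask: boolean HxW array, True where the model must inpaint."""
    target = np.array(Image.open(image_path).convert("RGB"))
    control = target.copy()
    # The fill value for the hidden region is a design choice; this thread
    # variously uses 0, 255 (white), or -1 after normalization.
    control[mask] = 0
    return {"conditioning_image": control, "image": target}
```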
I tried this, but the output has nothing to do with my control (the masked image). Here is the condition:

control: [image]
reconstruction: [image]

But the output is as below: [image]
To train an inpainting ControlNet, which scheduler should I use: DDPM or DDIM?
I am also trying to train the inpainting ControlNet, but so far I haven't succeeded. Maybe you should first read the code in gradio_inpainting.py; you will see that you should use -1 to mask the normalized image.
(I guess.)
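A sketch of what that masking convention could look like (the shapes and the exact sentinel handling are assumptions, not the verbatim gradio_inpainting.py code):

```python
import numpy as np

# Assumed inputs: HxWx3 uint8 image, HxW uint8 mask (255 = region to inpaint).
image = np.zeros((512, 512, 3), dtype=np.uint8)  # stand-in for the real photo
mask = np.zeros((512, 512), dtype=np.uint8)
mask[128:256, 128:256] = 255

control = image.astype(np.float32)
control[mask > 127] = -255.0  # sentinel for "unknown" pixels
control = control / 255.0     # valid pixels land in [0, 1], masked pixels at -1
```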
@Hubert2102 I am not sure whether you still need a solution. I faced a similar problem and found a solution as well.
Problem: inpainting tries to repaint the whole scene instead of just inpainting the masked portion.
Solution:
Your control image, masked image, and prompt are fine as they are.
Don't create/load the control_sd15_ini.ckpt file as we generally do for ControlNet training. Instead, as @tonyhuang2022 said, while loading the model for training you have to use the code from gradio_inpainting.py, i.e.:
```python
model = create_model('PATH/TO/control_v11p_sd15_inpaint.yaml')
model.load_state_dict(load_state_dict('PATH/TO/v1-5-pruned.ckpt'), strict=False)
model.load_state_dict(load_state_dict('PATH/TO/control_v11p_sd15_inpaint.pth'), strict=False)
```
Hope this helps.
> I am also trying to train the inpainting ControlNet, but so far I haven't succeeded. Maybe you should first read the code in gradio_inpainting.py; you will see that you should use -1 to mask the normalized image. (I guess.)
- When sampling the image, we need to apply the mask during the denoising process.
- When we get the final outputs, we should also use the mask to post-process the generated images (see the sketch below).
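A sketch of the first step, assuming a diffusers-style scheduler (`add_noise` is the real diffusers API; the surrounding names are assumptions):

```python
import torch

def blend_known_region(x_t, known_latents, mask, noise_scheduler, t):
    """mask: 1 where the model should inpaint, 0 where the original is known."""
    noise = torch.randn_like(known_latents)
    # Noise the known latents to the current timestep so both halves match.
    known_t = noise_scheduler.add_noise(known_latents, noise, t)
    return mask * x_t + (1 - mask) * known_t

# For the second step, after the final denoising step and VAE decode,
# paste the unmasked pixels back exactly:
# final = mask * generated_image + (1 - mask) * original_image
```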
Your words helped me crack this in time, @tonyhuang2022. I am not sure why it fails for you. Let me know if you still have the problem; I can guide you. Check my comment to @Hubert2102.
@Lakshmanaraja Hi there, I want to fine-tune my ControlNet on my fine-tuned SD 1.5 inpainting model. However, I found that after 100 steps my inpainting area became all black. Would you mind sharing some advice?
BTW, I modified the diffusers code. The control image is the masked image with -1 in the mask area. It would be great if you could share your scripts.
If the whole image becomes black, it indicates a vanishing-gradient issue. It may be because the GPU is not compatible with xFormers (e.g., in my personal experience on Paperspace, the A6000 is not compatible but the A5000 is; I don't have a definitive list of which GPU instances are compatible with xFormers). So check what server you are using.
If only the masked area is black, that is actually fine; you just need to train more.
Note: I didn't change the mask area to -1 in the control image. I just made it white, i.e., assigned 255 in the NumPy array:
```python
import numpy as np
from PIL import Image

# xmin, xmax, ymin, ymax represent the area to be masked.
# (The path below is a placeholder; the original snippet was truncated.)
pil_array = np.array(Image.open('PATH/TO/image.png'))
pil_array[ymin:ymax, xmin:xmax] = 255  # paint the masked region white
```
Note: I am still using the ControlNet 1.0 training code, with only the small changes I mentioned in my reply to @Hubert2102, but I use the inpainting model from ControlNet 1.1.
I am also facing this issue. I am doing this: the training script is the one used by the diffusers txt2img ControlNet example.
Is the logic correct? Normally in the training phase we start with the target image (the one to be generated) and add noise to it, while the ControlNet gets the conditioning image; the UNet predicts the noise, which goes into the loss function. Is it the same when training an inpainting ControlNet?
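For reference, a minimal sketch of the standard diffusers-style ControlNet training step (variable names such as `vae`, `prompt_embeds`, and `control_images` are assumptions standing in for the real script's objects):

```python
import torch
import torch.nn.functional as F

# Latents of the target (unmasked) image, encoded by the VAE.
latents = vae.encode(target_images).latent_dist.sample() * vae.config.scaling_factor
noise = torch.randn_like(latents)
timesteps = torch.randint(0, noise_scheduler.config.num_train_timesteps,
                          (latents.shape[0],), device=latents.device)
noisy_latents = noise_scheduler.add_noise(latents, noise, timesteps)

# The ControlNet receives the conditioning image (the masked image here).
down_res, mid_res = controlnet(
    noisy_latents, timesteps,
    encoder_hidden_states=prompt_embeds,
    controlnet_cond=control_images,
    return_dict=False,
)

noise_pred = unet(
    noisy_latents, timesteps,
    encoder_hidden_states=prompt_embeds,
    down_block_additional_residuals=down_res,
    mid_block_additional_residual=mid_res,
).sample

loss = F.mse_loss(noise_pred, noise)  # epsilon (noise-prediction) objective
```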
Hello, we are encountering the same problem now. Have you solved it?
I'm only making guesses here, but some things come to mind: try `--proportion_empty_prompts=0.5`, which makes a lot of sense here, as the very goal of inpainting is to guess unknown areas. Here's a baseline for all your inpaint models: https://github.com/lllyasviel/ControlNet/discussions/561
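`--proportion_empty_prompts` in the diffusers training script replaces that fraction of captions with the empty string. The idea in miniature:

```python
import random

def maybe_drop_prompt(caption: str, p: float = 0.5) -> str:
    # With probability p, train on an empty caption so the model learns
    # to rely on the conditioning image alone.
    return "" if random.random() < p else caption
```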
Is it normal for your trained models to show discoloration or slight changes in areas that are not masked? How do you keep these areas exactly as they are?
True, it is quite challenging. Were you able to solve this?
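One common mitigation (a sketch of a standard trick, not something confirmed in this thread): composite the generated pixels back only inside the mask, so unmasked areas stay byte-identical to the original.

```python
import numpy as np
from PIL import Image

# Assumed inputs: original_image, generated_image, mask_image are PIL images
# of the same size; the mask is white where inpainting happened.
orig = np.array(original_image)
gen = np.array(generated_image)
m = (np.array(mask_image.convert("L")) > 127)[..., None]  # HxW -> HxWx1 bool
result = Image.fromarray(np.where(m, gen, orig).astype(np.uint8))
```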
When I retrain the inpainting ControlNet, I get bad results. Did I make any mistakes? Control (mask-area pixels are zero): [image] Ground truth: [image] Model result: [image]
The scheduler is a critical factor that contributes to stability. I recommend exploring the use of DDIM or DDPM as your scheduling approach.
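In diffusers, either scheduler is a drop-in swap (a sketch; the checkpoint ID is the commonly used SD 1.5 repo, an assumption here):

```python
from diffusers import DDIMScheduler, DDPMScheduler

# DDPM is the usual training-time noise scheduler; DDIM is a common
# deterministic choice at inference. Both can load from the same config.
noise_scheduler = DDPMScheduler.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="scheduler")
ddim = DDIMScheduler.from_config(noise_scheduler.config)
```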
> Take the masked image as the control image, and have the model predict the full, original unmasked image.
Doesn't the model also require the mask itself as an input?
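For comparison, the diffusers ControlNet inpaint pipeline does take the mask explicitly, alongside the control image (a sketch; `image`, `mask_image`, and `control_image` are assumed PIL/tensor inputs):

```python
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetInpaintPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11p_sd15_inpaint", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16)

prompt = "your prompt here"
# image: original PIL image; mask_image: white-on-black PIL mask;
# control_image: the conditioning image fed to the ControlNet.
result = pipe(prompt, image=image, mask_image=mask_image,
              control_image=control_image).images[0]
```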
@geroldmeisinger
I've resumed ControlNet inpainting training on a 100-image dataset, but the result looks like this: [image]
This is the mask; I just want to invert the mask, so the background is inpainted rather than the subject: [image]
Or should I make the mask like this? [image]
Could you please share the details of how you trained your inpainting ControlNet model? Did you just take the masked image as the input to the ControlNet and force the model to predict the masked region?