lllyasviel / ControlNet

Let us control diffusion models!

Much worse test results when using gradio_canny2image.py than validation results #417

Open dedoogong opened 1 year ago

dedoogong commented 1 year ago

Hi, thanks for your great work. I've trained ControlNet using Canny edges. After 30 epochs, the validation images in image_log show quite realistic, good results. So I used that model to run gradio_canny2image.py and tried many different parameter settings (CFG scale, etc.), but it always gives much worse results than the validation images. Is there any difference in process or parameter settings between validation and the web version?

Please help me~! Thank you!

HassanBinHaroon commented 1 year ago

@dedoogong Have you figured out the reason? If yes, kindly elaborate.

ELEPOT commented 1 year ago

I have the same problem. I trained a ControlNet model on my own dataset, and when I load it in AUTOMATIC1111's webui the results are much worse than those in image_log.

dedoogong commented 1 year ago

Yes, I still have the problem. I wonder what differs between the evaluation process (run every 300 iterations) and the standalone inference process (gradio_canny2image.py), in code or in parameter settings.

ELEPOT commented 1 year ago

After digging through some code, I found that in the evaluation process the CFG scale is 9, the sampling steps are 50, and the sampler seems to be DDIM. But after I applied these settings the problem still wasn't fixed. Hope it helps in some way.
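For reference, these defaults appear in the log_images method of the ControlLDM class (cldm/cldm.py), which does its sampling through DDIMSampler. Roughly, paraphrased from the repo (the exact argument list in your checkout may differ):

# Sketch of the validation-time sampling defaults in ControlLDM.log_images
# (paraphrased; not the full signature).
@torch.no_grad()
def log_images(self, batch, N=4, n_row=2, sample=False,
               ddim_steps=50, ddim_eta=0.0,
               unconditional_guidance_scale=9.0, **kwargs):
    ...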

Edwardlmaooooooo commented 11 months ago

> Hi, thanks for your great work. I've trained ControlNet using Canny edges. [...] Is there any difference in process or parameter settings between validation and the web version?

Did you upload a Canny image when testing?
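This matters because gradio_canny2image.py runs its own Canny detector on whatever image you upload before sampling. Roughly, paraphrased from the demo script:

from annotator.canny import CannyDetector
from annotator.util import resize_image, HWC3

apply_canny = CannyDetector()

img = resize_image(HWC3(input_image), image_resolution)
detected_map = apply_canny(img, low_threshold, high_threshold)  # Canny is applied here
detected_map = HWC3(detected_map)
control = torch.from_numpy(detected_map.copy()).float().cuda() / 255.0

So if you upload an image that is already an edge map, the detector runs a second time on it and the conditioning no longer matches what the model saw in training, which could explain the much worse results.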

kako523 commented 10 months ago

Have you solved the problem? I have the same problem and don't know how to fix it. If you know the solution, please let me know. Thank you very much!

Namn23 commented 9 months ago

Hi, I have the same problem too. Have you figured out the reason? If so, please let me know. I would appreciate it very much!

SummerWRain commented 7 months ago

This problem has been bothering me for a long time, but after some effort I wrote an inference script, and now it works relatively normally. I hope it helps you.

I figured it might be because the gradio_canny2image.py test code differs from log_images in the ControlLDM class used during training. I'm still not sure why, and I'd appreciate it if someone could explain what's going on. My task doesn't need a text prompt, so the inference script has no text input.

My coding skills are limited, so if anyone can optimize it further, that would be great!

from share import *

import os

import cv2
import einops
import numpy as np
import torch
from PIL import Image

from annotator.util import resize_image
from cldm.ddim_hacked import DDIMSampler
from cldm.model import create_model, load_state_dict

# Configs
resume_path = '/ControlNet/lightning_logs/version_6/checkpoints/last.ckpt'  # your checkpoint path
N = 1  # batch size
ddim_steps = 50

# Build the model and load the trained weights.
model = create_model('./models/cldm_v21.yaml').cpu()
model.load_state_dict(load_state_dict(resume_path, location='cuda'))
model = model.cuda()
ddim_sampler = DDIMSampler(model)

# Load the control image. Note: it is fed to the model as-is,
# so it should already be a Canny edge map.
img_path = 'your image path'
img = cv2.imread(img_path)
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img = resize_image(img, 512)

# HWC uint8 -> BCHW float in [0, 1], on GPU.
control = torch.from_numpy(img.copy()).float().cuda() / 255.0
control = torch.stack([control for _ in range(N)], dim=0)
control = einops.rearrange(control, 'b h w c -> b c h w').clone()

# No text prompt: use the empty-prompt embedding for both the conditional
# and the unconditional cross-attention branches.
c_cat = control.cuda()
c_cross = model.get_unconditional_conditioning(N)
uc_cross = model.get_unconditional_conditioning(N)
cond = {"c_concat": [c_cat], "c_crossattn": [c_cross]}
uc_full = {"c_concat": [c_cat], "c_crossattn": [uc_cross]}

# Latent shape: 4 channels at 1/8 of the image resolution.
_, _, h, w = cond["c_concat"][0].shape
shape = (4, h // 8, w // 8)

samples, intermediates = ddim_sampler.sample(ddim_steps, N,
                                             shape, cond, verbose=False, eta=0.0,
                                             unconditional_guidance_scale=9.0,
                                             unconditional_conditioning=uc_full)

# Decode the latents and convert to an RGB uint8 image.
x_samples = model.decode_first_stage(samples)
x_samples = x_samples.squeeze(0)
x_samples = (x_samples + 1.0) / 2.0
x_samples = x_samples.clamp(0.0, 1.0)                 # avoid uint8 overflow
x_samples = x_samples.permute(1, 2, 0).cpu().numpy()  # CHW -> HWC
x_samples = (x_samples * 255).astype(np.uint8)

os.makedirs('./outputs', exist_ok=True)
image_name = img_path.split('/')[-1]
Image.fromarray(x_samples).save('./outputs/' + image_name)
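One thing to watch: unlike the gradio demo, this script does not run a Canny detector, so the file at img_path must already be an edge map. If you are starting from a natural image, compute the map first. A minimal OpenCV sketch (the 100/200 thresholds and file names are placeholders):

import cv2
import numpy as np

raw = cv2.imread('photo.jpg')            # natural image (placeholder path)
edges = cv2.Canny(raw, 100, 200)         # single-channel edge map
edges = np.stack([edges] * 3, axis=-1)   # replicate to 3 channels, like HWC3
cv2.imwrite('canny_control.png', edges)  # use this as img_path in the script above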