Algolzw / image-restoration-sde

Image Restoration with Mean-Reverting Stochastic Differential Equations, ICML 2023. Winning solution of the NTIRE 2023 Image Shadow Removal Challenge.
https://algolzw.github.io/ir-sde/index.html
MIT License
572 stars, 42 forks

Bokeh model does not work #22

Closed: hcl14 closed this issue 1 year ago

hcl14 commented 1 year ago

I'm having trouble getting the bokeh model to work. The corresponding script for this model (/codes/config/latent_bokeh/test.py) doesn't seem to be correct: it fails in multiple places. Can anybody confirm that it should work?

The following output was obtained using these parameters:

Gradio produces the wrong picture with the config codes/config/latent_bokeh/options/bokeh/test/refusion.yml:

[Image: Screenshot 2023-06-13 at 0 05 11]

Algolzw commented 1 year ago

Hi, in the test config file, you should change the network settings to:

network_G:
  which_model: ConditionalNAFNet
  setting:
    img_channel: 4
    width: 64
    enc_blk_nums: [2, 2, 4, 8]
    middle_blk_num: 12
    dec_blk_nums: [2, 2, 2, 2]

Moreover, since the bokeh effect transform task is quite different from the other tasks, you should also change some functions in sde_utils.py so that they accept **kwargs:


    def score_fn(self, x, t, **kwargs):
        # need to pre-set mu and score_model
        noise = self.model(x, self.mu, t, **kwargs)
        return self.get_score_from_noise(noise, t)

    def noise_fn(self, x, t, **kwargs):
        # need to pre-set mu and score_model
        return self.model(x, self.mu, t, **kwargs)

    def reverse_sde(self, xt, T=-1, save_states=False, save_dir='sde_state', **kwargs):
        T = self.sample_T if T < 0 else T

        x = xt.clone()
        for t in tqdm(reversed(range(1, T + 1))):
            score = self.score_fn(x, t, **kwargs)
            x = self.reverse_sde_step(x, score, t)
            # x = self.reverse_sde_step_mean(x, score, t)

            if save_states:  # save at most about 100 intermediate states
                interval = max(self.T // 100, 1)  # avoid a zero interval when T < 100
                if t % interval == 0:
                    idx = t // interval
                    os.makedirs(save_dir, exist_ok=True)
                    x_L, x_R = x.chunk(2, dim=1)
                    tvutils.save_image(torch.cat([x_L, x_R], dim=3).data, f'{save_dir}/state_{idx}.png', normalize=False)

        return x

I have also updated test.py (lines 91-98) so that you can run our test script.
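As a rough illustration of why the functions above take **kwargs, the toy sketch below forwards extra conditioning values through a reverse loop to a model on every step. `ToySDE`, `toy_model`, and the placeholder update rule are hypothetical stand-ins written for this example; they are not the repository's IRSDE class or its real reverse step.

```python
class ToySDE:
    """Toy stand-in for a reverse loop; NOT the repository's IRSDE class."""

    def __init__(self, T):
        self.T = T
        self.model = None
        self.mu = None

    def set_model(self, model):
        self.model = model

    def set_mu(self, mu):
        self.mu = mu

    def noise_fn(self, x, t, **kwargs):
        # Extra conditioning (lens settings, disparity) is passed through untouched.
        return self.model(x, self.mu, t, **kwargs)

    def reverse(self, xt, **kwargs):
        x = xt
        for t in reversed(range(1, self.T + 1)):
            noise = self.noise_fn(x, t, **kwargs)
            x = x - noise / self.T  # placeholder update, not the real SDE step
        return x


calls = []


def toy_model(x, mu, t, src_lens=None, tgt_lens=None, disparity=None):
    # Hypothetical conditional network: it only records what it was given.
    calls.append((t, src_lens, tgt_lens, disparity))
    return x - mu


sde = ToySDE(T=4)
sde.set_model(toy_model)
sde.set_mu(1.0)
result = sde.reverse(3.0, src_lens=18.0, tgt_lens=160.0, disparity=35.0)
```

Without **kwargs in the intermediate functions, the call would fail as soon as the conditioning arguments are supplied, which is one way a task-specific model can break a shared sampling loop.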

hcl14 commented 1 year ago

Sorry, the problem persists. I applied all your changes and tried an image from the dataset:

metadata: 00000, Sony50mmf1.8BS, Sony50mmf16.0BS, 35
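A minimal sketch of how a metadata line like the one above could be split into conditioning values. The helper name `parse_bokeh_meta` is hypothetical, and the factor of 10 applied to the f-numbers is an assumption inferred from the values 18 and 160 hard-coded in the app.py in this thread; the repository's own dataset loader may do this differently.

```python
import re


def parse_bokeh_meta(line, scale=10.0):
    """Split one metadata line into (image_id, src_lens, tgt_lens, disparity).

    `scale` is an assumption: f/1.8 -> 18 and f/16.0 -> 160 match the
    constants used in the app.py posted in this thread.
    """
    image_id, src, tgt, disparity = [field.strip() for field in line.split(",")]

    def f_number(lens_name):
        # "Sony50mmf1.8BS" -> 1.8: take the number that follows the "f".
        match = re.search(r"f(\d+(?:\.\d+)?)", lens_name)
        if match is None:
            raise ValueError(f"no f-number found in {lens_name!r}")
        return float(match.group(1))

    return image_id, f_number(src) * scale, f_number(tgt) * scale, float(disparity)


meta = parse_bokeh_meta("00000, Sony50mmf1.8BS, Sony50mmf16.0BS, 35")
```

Parsing the line instead of hard-coding the numbers makes it harder to swap the source and target lens by accident when testing a different image.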

00000 src

app.py:

import gradio as gr
import cv2
import argparse
import sys
import numpy as np
import torch
from pathlib import Path

import options as option
from models import create_model
sys.path.insert(0, "../../")
import codes.utils as util

# options
parser = argparse.ArgumentParser()
parser.add_argument("-opt", type=str, default='/workspace/work/project2/configs/latent-debokeh/refusion.yml', help="Path to options YAML file.")
opt = option.parse(parser.parse_args().opt, is_train=False)

opt = option.dict_to_nonedict(opt)

# load pretrained model by default
model = create_model(opt)
device = model.device

sde = util.IRSDE(max_sigma=opt["sde"]["max_sigma"], T=opt["sde"]["T"], schedule=opt["sde"]["schedule"], eps=opt["sde"]["eps"], device=device)
sde.set_model(model.model)

def deraining(image):
    image = image / 255.

    src_lens = torch.tensor(float(18))
    tgt_lens = torch.tensor(float(160))
    disparity = torch.tensor(float(35))

    image = torch.tensor(image).float().cuda()
    image = torch.permute(image, (2, 0, 1))

    latent_LQ, hidden = model.encode(torch.unsqueeze(image, 0))
    noisy_state = sde.noise_state(latent_LQ)

    model.feed_data(noisy_state, latent_LQ, src_lens=src_lens, tgt_lens=tgt_lens, disparity=disparity, GT=None)
    model.test(sde, hidden=hidden, save_states=False)
    visuals = model.get_current_visuals(need_GT=False)
    output = util.tensor2img(visuals["Output"].squeeze())
    return output

interface = gr.Interface(fn=deraining, inputs="image", outputs="image", title="Image Deraining using IR-SDE")
interface.launch(share=True)

Algolzw commented 1 year ago

Hi, can you also provide the refusion.yml file?

hcl14 commented 1 year ago

name: latent-reffusion-bokeh
suffix: ~  # add suffix to saved images
model: latent_denoising
distortion: bokeh
gpu_ids: [0]

sde:
  max_sigma: 50
  T: 100
  schedule: cosine # linear, cosine
  eps: 0.005

degradation:
  # for denoising
  sigma: 25
  noise_type: G # Gaussian noise: G

  # for super-resolution
  scale: 4

datasets:
  test1:
    name: Test
    mode: BokehLQ
    dataroot_LQ: /home/x_ziwlu/datasets/ntire2023/bokeh/ntire_val/src
    dataroot_meta: /home/x_ziwlu/datasets/ntire2023/bokeh/ntire_val/meta.txt

#### network structures
network_G:
  which_model: ConditionalNAFNet
  setting:
    img_channel: 4
    width: 64
    enc_blk_nums: [2, 2, 4, 8]
    middle_blk_num: 12
    dec_blk_nums: [2, 2, 2, 2]

network_L:
  which_model: UNet
  setting:
    in_ch: 3
    out_ch: 3
    ch: 64
    ch_mult: [1, 2, 4]
    embed_dim: 4

#### path
path:
  pretrain_model_G: pretrained_models/latent-reffusion-bokeh.pth
  pretrain_model_L: pretrained_models/latent-bokeh.pth

Algolzw commented 1 year ago

Hi, here are my results on the image you provided:

[Image: Screenshot 2023-06-18 at 20 28 42]

I have also uploaded app.py to this repo (I had already updated several files last time). You may want to download the updated code and run it again.

Best.

hcl14 commented 1 year ago

Thank you! I replicated the test result!

Then I tried it on my own images (outpaintings made with Stable Diffusion). Here the diffusion process does not seem to finish cleanly (noise is left over), or the blurry parts of the image are left untouched.

My test images: 0_blurred_man 0_RO9aplT 0_Young-Athletic-Black-Man-Standing-Outside-Ready-to-Exercise

Some results:

upscale_4x_t_100_sigma_50_cosine: 1 2 3

upscale_4x_t_200_sigma_100_cosine: 1 2 3

hcl14 commented 1 year ago

Perhaps you can give some advice on tuning the parameters? Thank you!

Algolzw commented 1 year ago

Hi, our Latent Refusion model tends to produce noisy images when the test image is outside the distribution of the training dataset. To fine-tune the model, you can add a small amount of noise to the latent when pretraining the latent-UNet model. Since all training images are synthetically generated, the model struggles with other types of images (out of distribution). The same problem occurs in other CNN-based bokeh models, as you can see in the challenge report.
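One possible reading of the "add some small noise to the latent" suggestion, as a plain-Python sketch. The function name `perturb_latent`, the default `sigma`, and the idea of applying it to the encoded latent before decoding are all assumptions made for illustration; the thread does not say where or how strongly the noise should be applied.

```python
import random


def perturb_latent(latent, sigma=0.05, rng=None):
    """Return a copy of `latent` with zero-mean Gaussian noise added.

    `latent` is a flat list of floats here; a real pipeline would use tensors.
    `sigma` is a hypothetical value, not one taken from the repository.
    """
    rng = rng or random.Random()
    return [z + rng.gauss(0.0, sigma) for z in latent]


latent = [0.2, -1.0, 0.7, 0.0]
noisy = perturb_latent(latent, sigma=0.05, rng=random.Random(0))
```

The intent of such an augmentation would be to let the decoder see slightly imperfect latents during pretraining, so that small errors left by the diffusion stage are less likely to show up as visible noise.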

Algolzw commented 1 year ago

It would be better to retrain the model on your own dataset, if you have one.

hcl14 commented 1 year ago

Okay, understood, thanks. I guess this issue can be closed then!