Lin-Sinorodin / SwinIR_wrapper

Wrapper for "SwinIR: Image Restoration Using Swin Transformer", for easy usage as a package.

[REQ] HINet wrapper #1

Closed MarcoRavich closed 2 years ago

MarcoRavich commented 2 years ago

Hi there, first of all thanks for your work: your wrapper is very clean and usable.

If you're interested, could you build something similar for HINet: Half Instance Normalization Network for Image Restoration?

With the help of HIN Block, HINet surpasses the state-of-the-art (SOTA) on various image restoration tasks. For image denoising, we exceed it 0.11dB and 0.28 dB in PSNR on SIDD dataset, with only 7.5% and 30% of its multiplier-accumulator operations (MACs), 6.8 times and 2.9 times speedup respectively. For image deblurring, we get comparable performance with 22.5% of its MACs and 3.3 times speedup on REDS and GoPro datasets. For image deraining, we exceed it by 0.3 dB in PSNR on the average result of multiple datasets with 1.4 times speedup.

Some "real-world" tests by Selur (images not included here), one per mode: Deblur GoPro, Deblur REDS, denoise, derain.

Hope that inspires !

Lin-Sinorodin commented 2 years ago

Thanks for the comment, I'm glad you found it useful!

Although it looks very interesting, I won't have the time to work on this in the near future...

If you are willing to try it out yourself, their code makes it look pretty straightforward. Judging by the code on this page, something like this could be a pretty clean and short wrapper:

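import cv2
import numpy as np
import torch

from basicsr.models import create_model
from basicsr.train import parse_options

# Build the model from the parsed options
opt = parse_options(is_train=False)
model = create_model(opt)

# Read the low-quality image and convert OpenCV's BGR order to RGB
path = 'input.png'  # placeholder: path to the low-quality input image
img_lq = cv2.imread(path, cv2.IMREAD_COLOR)
img_lq = cv2.cvtColor(img_lq, cv2.COLOR_BGR2RGB)

# HWC uint8 -> CHW float32; scaling to [0, 1] is an assumption about the input range the model expects
img_lq_tensor = torch.from_numpy(img_lq.transpose(2, 0, 1)).float() / 255.0

# Get the restored image as an array instead of writing it to disk
img_hq: np.ndarray = model.my_single_image_inference(img_lq_tensor)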

The main difference is adding a method to the model, similar to the existing inference function, that returns the image instead of saving it:


def my_single_image_inference(self, img) -> np.ndarray:
    # Add a batch dimension and feed the image to the model
    self.feed_data(data={'lq': img.unsqueeze(dim=0)})

    # Optionally split the input into grids before inference
    if self.opt['val'].get('grids', False):
        self.grids()

    self.test()

    # Merge the grids back into a single image
    if self.opt['val'].get('grids', False):
        self.grids_inverse()

    visuals = self.get_current_visuals()
    sr_img = tensor2img([visuals['result']])

    # REPLACE THIS
    # imwrite(sr_img, save_path)

    # WITH THIS
    return sr_img
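If editing the repository's model class is not desirable, one option is to bind such a function to the model instance at runtime. The following is only a minimal sketch of that idea: `DummyModel` and `single_image_inference_return` are hypothetical stand-ins, not part of HINet or BasicSR, and the real model would need the batch dimension and grid handling shown above.

```python
import types


class DummyModel:
    """Hypothetical stand-in for the object returned by create_model(opt)."""

    def feed_data(self, data):
        self.lq = data['lq']

    def test(self):
        # A real model would run the network here; this stand-in copies the input.
        self.output = self.lq

    def get_current_visuals(self):
        return {'result': self.output}


def single_image_inference_return(self, img):
    """Run inference and return the result instead of writing it to disk."""
    self.feed_data(data={'lq': img})
    self.test()
    return self.get_current_visuals()['result']


model = DummyModel()
# Bind the function to this one instance so it can be called like a regular method.
model.my_single_image_inference = types.MethodType(single_image_inference_return, model)

result = model.my_single_image_inference([[1, 2], [3, 4]])
```

Binding to the instance keeps the library source untouched, at the cost of having to repeat the binding for every model object that is created.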

This is just from a quick glance, and I probably missed something, but the main idea is here. Done this way, it should also be possible to merge this into a higher-level API that supports both SwinIR and the paper you referenced. Seems like a cool project, and I'll be happy to contribute.

Hope it's helpful, Lin