xinntao / Real-ESRGAN

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
BSD 3-Clause "New" or "Revised" License

Add support for batched inference of images of same size #814

Open aliencaocao opened 3 months ago

aliencaocao commented 3 months ago

Allow users to pass a list of np arrays to the enhance method, like this:

from realesrgan import RealESRGANer
from realesrgan.archs.srvgg_arch import SRVGGNetCompact

# Weights: https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesr-general-x4v3.pth
model = SRVGGNetCompact(num_in_ch=3, num_out_ch=3, num_feat=64, num_conv=32, upscale=4, act_type='prelu')
netscale = 4

upsampler = RealESRGANer(
    scale=netscale,
    model_path='realesr-general-x4v3.pth',
    model=model,
    half=True)

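from PIL import Image
import numpy as np

# Load two images and convert them to numpy arrays
img = np.asarray(Image.open('img1.jpg'))
img2 = np.asarray(Image.open('img2.jpg'))
# Batched inference requires every image to have the same shape
assert img.shape == img2.shape
scale = 4
# Pass a list of arrays instead of a single array
output, _ = upsampler.enhance([img, img2], outscale=scale)
print(output)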

Note that, due to constraints of the model and its use of padding, only images with the same height and width (shape) are supported. You can resize them to a common size beforehand.
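As a rough illustration of resizing beforehand, here is a minimal sketch that brings a list of arrays to one common shape with Pillow. The helper name resize_to_common_shape and the choice of bicubic resampling are assumptions made for this example; they are not part of this PR or of Real-ESRGAN.

```python
import numpy as np
from PIL import Image


def resize_to_common_shape(images, size=None):
    """Resize HxWxC uint8 arrays so that all share one shape.

    Hypothetical helper for illustration only.
    `size` is (width, height); it defaults to the size of the first image.
    """
    if not images:
        return []
    if size is None:
        height, width = images[0].shape[:2]
        size = (width, height)
    resized = []
    for img in images:
        if (img.shape[1], img.shape[0]) == size:
            # Already the target size, keep the array untouched
            resized.append(img)
        else:
            # Bicubic resampling is an arbitrary choice for this sketch
            pil_img = Image.fromarray(img).resize(size, Image.BICUBIC)
            resized.append(np.asarray(pil_img))
    return resized
```

Keep in mind that resizing changes the aspect ratio when the inputs differ in proportions, so cropping or padding may be a better fit for some inputs.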

This can come in handy for fast video inference, where all frames are the same size.
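For the video case, frames would typically be fed in fixed-size chunks rather than all at once. The following is a minimal sketch of such chunking; iter_batches is a hypothetical helper name and the batch size is up to the caller. Neither comes from this PR.

```python
from typing import Iterable, Iterator, List

import numpy as np


def iter_batches(frames: Iterable[np.ndarray], batch_size: int) -> Iterator[List[np.ndarray]]:
    """Yield lists of up to `batch_size` frames, checking that shapes match.

    Hypothetical helper for illustration only.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    expected_shape = None
    batch: List[np.ndarray] = []
    for frame in frames:
        if expected_shape is None:
            expected_shape = frame.shape
        elif frame.shape != expected_shape:
            raise ValueError(f"frame shape {frame.shape} differs from {expected_shape}")
        batch.append(frame)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Final partial batch
        yield batch
```

Each yielded list could then be passed to upsampler.enhance(batch, outscale=scale), assuming the list-accepting enhance proposed in this PR.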

Fixes https://github.com/xinntao/Real-ESRGAN/issues/634

NicholasCao commented 1 month ago

I tried this batch inference, but in my tests it turned out to be slower than running inference on a single image. Is there something wrong? @aliencaocao

aliencaocao commented 1 month ago

It could be CPU bound, because the implementation is essentially a lot of for loops over np arrays. It is surely not as optimized as it could be. The perf improvement is more for the case where you have a very large batch size, like 100+; then the time saved on GPU kernel launches outweighs the extra overhead of the for loops.
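To illustrate the kind of per-image Python loop overhead described here, this is a minimal sketch comparing a looped HWC-to-CHW conversion with a single stacked NumPy operation. It is not the code from this PR; the function names and the preprocessing steps (scaling to [0, 1] and transposing) are assumptions chosen for the example.

```python
import time

import numpy as np


def preprocess_loop(images):
    """Convert each HxWxC uint8 image to CxHxW float32 one at a time."""
    out = []
    for img in images:
        out.append(np.transpose(img.astype(np.float32) / 255.0, (2, 0, 1)))
    return np.stack(out)


def preprocess_vectorized(images):
    """Stack once, then convert the whole NxHxWxC batch in one pass."""
    batch = np.stack(images)
    return np.transpose(batch.astype(np.float32) / 255.0, (0, 3, 1, 2))


def best_time(fn, images, repeats=5):
    """Return the fastest wall-clock time of `fn(images)` over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(images)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    imgs = [rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8) for _ in range(128)]
    print("loop:      ", best_time(preprocess_loop, imgs))
    print("vectorized:", best_time(preprocess_vectorized, imgs))
```

Both functions produce the same array, so any timing difference comes from Python-level looping; how much that matters next to the GPU work depends on the model, image size, and batch size.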