Ironclad17 closed this issue 1 year ago
That example was meant for RIFE. For super-resolution filters, you need to use these:
import vapoursynth as vs
import vsmlrt
from vsmlrt import RIFEModel

th = (src.height + 31) // 32 * 32  # round up to a multiple of 32; adjust 32 and 31 to the network's input-resolution requirement
tw = (src.width + 31) // 32 * 32   # same
# Pad to (tw, th) by extending the source region past the frame edge.
padded = src.resize.Bicubic(tw, th, format=vs.RGBS if WANT_FP32 else vs.RGBH, matrix_in_s="709", src_width=tw, src_height=th)
flt = vsmlrt.RIFE(padded, model=RIFEModel.v4_6, backend=backend, output_format=1)  # feed the padded clip; output_format=1 selects fp16 output
oh = src.height * (flt.height // th)  # not necessary for RIFE (i.e. oh = src.height), but required for super-resolution upscalers
ow = src.width * (flt.width // tw)
# Crop the upscaled padding back off while converting to the output format.
res = flt.resize.Bicubic(ow, oh, format=vs.YUV420P8, matrix_s="709", src_width=ow, src_height=oh)
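To make the size arithmetic concrete, here is a minimal sketch with hypothetical numbers (a 1920x1080 source and a 2x super-resolution model; both values are assumptions for illustration only):

src_w, src_h = 1920, 1080
tw = (src_w + 31) // 32 * 32           # 1920 (already a multiple of 32)
th = (src_h + 31) // 32 * 32           # 1088 (1080 rounded up by 8 rows of padding)
scale = 2                              # hypothetical 2x upscaler
flt_w, flt_h = tw * scale, th * scale  # 3840 x 2176 -- the padding gets upscaled too
ow = src_w * (flt_w // tw)             # 3840
oh = src_h * (flt_h // th)             # 2160 -- cropping to (ow, oh) discards the upscaled padding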
Release note updated.
I believe FP16 I/O is only useful if your application is bottlenecked by host-to-device transfers; otherwise (e.g. given a large enough num_streams), the computation should be able to almost fully hide the PCIe transfer time.
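As a minimal sketch of the difference (assuming a BlankClip stand-in for the source, and the output_format convention above, where 0 selects fp32 and 1 selects fp16 output):

import vapoursynth as vs
from vapoursynth import core
import vsmlrt
from vsmlrt import RIFEModel, Backend

src = core.std.BlankClip(width=1920, height=1088, format=vs.YUV420P8)  # placeholder source, already mod-32
backend = Backend.TRT(fp16=True, device_id=0, num_streams=2)

# fp32 I/O: RGBS input, output_format=0 -> fp32 frames cross the PCIe bus
rgbs = core.resize.Bicubic(src, format=vs.RGBS, matrix_in_s="709")
flt32 = vsmlrt.RIFE(rgbs, model=RIFEModel.v4_6, backend=backend, output_format=0)

# fp16 I/O: RGBH input, output_format=1 -> half the bytes per frame in each direction
rgbh = core.resize.Bicubic(src, format=vs.RGBH, matrix_in_s="709")
flt16 = vsmlrt.RIFE(rgbh, model=RIFEModel.v4_6, backend=backend, output_format=1)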
I just wanted to make sure no one else had this issue: the example code in release v13 caused me some problems. First, the input for RIFE is src, while the previous line's output is padded. Second, the output of RIFE is flt, but src_width and src_height are set to src.width and src.height; that is probably fine for RIFE, but with ESRGAN or other upscalers it leads to a cropped image. I think they should be equal to flt.width and flt.height. I also don't really understand why the resize lines have three size-related inputs. For ESRGAN, would this provide similar savings on memory bandwidth? Here is what I'm using now:
import vapoursynth as vs
from vapoursynth import core
from vsmlrt import RealESRGAN, RealESRGANModel, Backend

tw = (src.width + 31) // 32 * 32
th = (src.height + 31) // 32 * 32
padded = core.resize.Point(src, tw, th, format=vs.RGBH, matrix_in=1, src_width=tw, src_height=th)
flt = RealESRGAN(padded, model=RealESRGANModel.animevideov3, backend=Backend.TRT(fp16=True, device_id=0, num_streams=2))
res = core.resize.Point(flt, flt.width, flt.height, format=vs.YUV420P8, matrix=1, src_width=flt.width, src_height=flt.height)
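For reference, here is a sketch of the same Real-ESRGAN pipeline with the crop formula from the reply above applied, so the upscaled padding is removed rather than kept (assuming src is the source clip and the same imports as the snippet above):

tw = (src.width + 31) // 32 * 32
th = (src.height + 31) // 32 * 32
# Pad to mod-32 by extending the source region past the frame edge.
padded = core.resize.Point(src, tw, th, format=vs.RGBH, matrix_in=1, src_width=tw, src_height=th)
flt = RealESRGAN(padded, model=RealESRGANModel.animevideov3, backend=Backend.TRT(fp16=True, device_id=0, num_streams=2))
ow = src.width * (flt.width // tw)    # flt.width // tw is the model's scale factor
oh = src.height * (flt.height // th)
# Crop away the upscaled padding while converting to the output format.
res = core.resize.Point(flt, ow, oh, format=vs.YUV420P8, matrix=1, src_width=ow, src_height=oh)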