the-database / mpv-upscale-2x_animejanai

Real-time anime upscaling to 4k in mpv with Real-ESRGAN compact models

I will share with the developer how I fixed the video lagging issue #8

Open · foxbox93 opened 1 year ago

foxbox93 commented 1 year ago

First of all, thank you so much for making such a great program. I had some difficulty setting up 2x_animejanai with mpv this time, but I found two issues, corrected them, and I think I have overcome the problem.

Computer used: 13600K & Nvidia RTX 4070 Ti
My coding knowledge: close to zero
English proficiency: poor, but Google Translate handles it

When I first tried the Standard_V1_UltraCompact model, my PC roared loudly and frames were dropping.

After several rounds of trial and error, I overcame it by modifying values in 2x_SharpLines.vpy.

1. core.num_threads = 14

I saw in Task Manager that my CPU has 14 cores and changed the value accordingly. I think many people would see improvements if they adjusted this to match their own hardware (one way to derive it automatically is sketched after the script below).

2. I noticed that upscaling twice put a huge load on the system.

To summarize, I added an "SHD_ENGINE_NAME" as shown in the script below. Running the Compact engine for both passes seemed quite burdensome for the computer, so I moved each pass down one engine tier.

The result was very satisfying, and I would be very happy if it helps the developer.

Result: CPU 99% → 20~40%, GPU 99% → 60~80%


import vapoursynth as vs
import os

SD_ENGINE_NAME = "2x_AnimeJaNai_Strong_V1_Compact_net_g_120000"
HD_ENGINE_NAME = "2x_AnimeJaNai_Strong_V1_UltraCompact_net_g_100000"
SHD_ENGINE_NAME = "2x_AnimeJaNai_Strong_V1_SuperUltraCompact_net_g_100000"

core = vs.core
core.num_threads = 14  # can influence ram usage
colorspace = "709"

def scaleTo1080(clip, w=1920, h=1080):
    if clip.width / clip.height > 16 / 9:
        prescalewidth = w
        prescaleheight = w * clip.height / clip.width
    else:
        prescalewidth = h * clip.width / clip.height
        prescaleheight = h
    return vs.core.resize.Bicubic(clip, width=prescalewidth, height=prescaleheight)

def upscale2x(clip):
    engine_name = SD_ENGINE_NAME if clip.height < 720 else HD_ENGINE_NAME
    return core.trt.Model(
        clip,
        engine_path=f"C:\\Program Files (x86)\\mpv-lazy-20230404-vsCuda\\mpv-lazy\\vapoursynth64\\plugins\\vsmlrt-cuda\\{engine_name}.engine",
        num_streams=4,
    )

def upscale4x(clip):
    engine_name = HD_ENGINE_NAME if clip.height < 720 else SHD_ENGINE_NAME
    return core.trt.Model(
        clip,
        engine_path=f"C:\\Program Files (x86)\\mpv-lazy-20230404-vsCuda\\mpv-lazy\\vapoursynth64\\plugins\\vsmlrt-cuda\\{engine_name}.engine",
        num_streams=4,
    )

clip = video_in

if clip.height < 720:
    colorspace = "170m"

clip = vs.core.resize.Bicubic(clip, format=vs.RGBS, matrix_in_s=colorspace,
    # width=clip.width / 2.25, height=clip.height / 2.25  # pre-downscale
)

# pre-scale 720p or higher to 1080
if clip.height >= 720 or clip.width >= 1280:
    clip = scaleTo1080(clip)

# upscale 2x
clip = upscale2x(clip)

# upscale 2x again if necessary
if clip.height < 2160 and clip.width < 3840:
    # downscale down to 1080 if first 2x went over 1080
    if clip.height > 1080 or clip.width > 1920:
        clip = scaleTo1080(clip)

    # upscale 2x again << CHANGED to use the lighter second-pass engine >>
    clip = upscale4x(clip)

clip = vs.core.resize.Bicubic(clip, format=vs.YUV420P8, matrix_s=colorspace)

clip.set_output()
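
On point 1, rather than hard-coding 14, the thread count could be derived from the machine when the script loads. A minimal sketch, assuming psutil is available in the VapourSynth Python environment (it is not part of the original setup); note that os.cpu_count() reports logical threads, which is already VapourSynth's default:

```python
import os
import vapoursynth as vs

core = vs.core

# Hypothetical: pick num_threads from the machine instead of hard-coding 14.
try:
    import psutil  # assumed optional dependency
    core.num_threads = psutil.cpu_count(logical=False) or os.cpu_count()
except ImportError:
    # os.cpu_count() returns logical threads (the VapourSynth default),
    # so halving it is only a rough stand-in for the physical core count.
    core.num_threads = max(1, (os.cpu_count() or 2) // 2)
```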

dvize commented 1 year ago

I can confirm that I am seeing a drop in usage with these settings (which makes sense, since you are using a more performant engine for the 4x step). Setting the core count to 12 for me (5900X CPU) made it open faster during the initial buffering. I think with the 4090 I have enough headroom to add VapourSynth RIFE, with maybe some additional buffer.

see here:

https://cdn.discordapp.com/attachments/290974517524824064/1095541333877342258/image.png
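
On the idea of stacking RIFE on top of the upscale: a rough sketch of where interpolation could slot into the .vpy, just before the final conversion back to YUV. This is not from the original script; it assumes the vs-mlrt Python wrapper (vsmlrt.py) is importable from the script, and the model/backend names shown are assumptions that may need adjusting for a given install:

```python
# Hypothetical addition near the end of the .vpy, while the clip is still RGBS.
# Assumes vsmlrt.py (from vs-mlrt) is on the Python path; names may differ per version.
from vsmlrt import RIFE, RIFEModel, Backend

clip = RIFE(
    clip,                                         # RGBS clip after the 2x passes
    multi=2,                                      # double the frame rate
    model=RIFEModel.v4_4,                         # assumed model identifier
    backend=Backend.TRT(fp16=True, num_streams=2),
)
```

Whether there is enough GPU headroom for this on top of two Real-ESRGAN passes is exactly the kind of thing that has to be measured per machine, as hooke007 notes below.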

the-database commented 1 year ago

Thanks for sharing this. With V2 I have been working on making everything more easily configurable, but this demonstrates a use case I didn't consider. I'll see if I can work this into V2.

hooke007 commented 1 year ago

> I saw in Task Manager that my CPU has 14 cores and changed the value accordingly. I think many people would see improvements if they adjusted this to match their own hardware.

Actually, you were reducing rather than increasing the number of threads VapourSynth would use: num_threads defaults to the number of CPU threads, not cores. In theory you could lower it further to reduce RAM/VRAM usage. Also, mpv links VapourSynth in a hacky way, so changing the value of buffered-frames or concurrent-frames (ref https://mpv.io/manual/master/#video-filters-vapoursynth ) can also improve things (or make them worse). Given how complex VapourSynth usage is, it's difficult to give reasonable values for everyone; we each have to test to find the best values for our own plugins/scripts/videos/devices.
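
For reference, those two options belong to mpv's vapoursynth video filter rather than the .vpy script itself. A minimal sketch of how they could be set in mpv.conf; the script path is a placeholder and the numbers are only starting points to test, per the manual page linked above:

```
# mpv.conf -- placeholder path, values to be tuned per machine
vf=vapoursynth=~~/vs/2x_SharpLines.vpy:buffered-frames=4:concurrent-frames=8
```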