Asd-g / avs-mlrt

ML Runtimes for AviSynth+.
GNU General Public License v3.0

Weird pre-resize output and other architectures. #7

Closed Kuronoe-Ookami closed 7 months ago

Kuronoe-Ookami commented 11 months ago

I found something strange and want to ask whether I did something wrong or whether it's a bug. When I use any pre-resize (nnedi3) in this test, the image loses resolution in the corners. After some tests I realized that it only happens with widths greater than 1024, regardless of height (tested with 576p to 720p). What could it be?

script: https://katb.in/yiqoqupisuj

Second point: Is it possible to add support for other architectures, such as ESRGAN-LITE or Omni? I don't know why ESRGAN didn't work, since it is the most basic of the architectures in use. Currently Compact works very well.

I left the GitHub links of the model authors in the script. Both are very friendly and willing to help.

I appreciate any help or tips.

Asd-g commented 11 months ago

FFVideoSource() # 1920x1080
nnedi3_rpow2(rfactor=2, cshift="Spline64Resize", fwidth=1280, fheight=720, ep0=8, nsize=2, nns=3, qual=2, pscrn=3, threads=24, opt=2) # pre-resize
ConvertBits(32)
ConvertToPlanarRGB()
mlrt_ncnn(network_path=".\models-1\models\RealESRGANv2\animejanaiV2L3.onnx", fp16=true, num_streams=3, builtin=false)
Spline64Resize(1920,1080)

https://i.slow.pics/8ZIX83BV.png https://i.slow.pics/x7cfr8Fk.png

I don't see issues if I correctly understand the used test script.

Where are the .onnx ESRGAN models that don't work?

Kuronoe-Ookami commented 11 months ago

I've never tested it with real content, so I can't say whether it's specific to anime. I made a mistake there and sent an image in 1080p; the source input is 1024x576, and the resolution always looks strange on the curves (like on the character's face) when I use anything larger, such as 1280 wide (the first image I posted before).

The idea of my script was to just take a video in 1024x576p and pass it through nnedi3 delivering a better video to the model (1280x720p).

image in original resolution:

The esrgan lite model that doesn't work on mlrt: https://openmodeldb.info/models/2x-Garfield1-308k I put the versions I use here: dropbox

To rule out possible problems, I tested it in VSGAN, and I don't see this problem in VapourSynth's VSGAN using the model in .pth. https://github.com/rlaphoenix/VSGAN https://vsgan.phoeniix.dev/en/stable/ So I don't know what it could be.

Anyway, thanks for helping.

Asd-g commented 11 months ago

Thanks for the models.

FFVideoSource("C:\Users\xyz\Downloads\275372804-1d376fdf-814c-4f49-be0a-a30db2c53026.png")
ConvertToPlanarRGB()
z_ConvertFormat(pixel_type="yv12", colorspace_op="rgb:709:709:f=>709:709:709:l", resample_filter_uv="spline36", dither_type="none", use_props=0)
nnedi3_rpow2(rfactor=2, cshift="Spline64Resize", fwidth=1280, fheight=720, ep0=8, nsize=2, nns=3, qual=2, pscrn=3, threads=24, opt=2) # pre-resize
ConvertBits(32)
ConvertToPlanarRGB()
mlrt_ncnn(network_path=".\model\2x_AnimeJaNai_V3_SmoothRC21_Compact_50k_fp32.onnx", fp16=true, num_streams=3, builtin=false)
Spline64Resize(1920,1080) 
Subtitle("with pre-resize")
#Subtitle("without pre-resize")

This is the output from the above script - https://imgbox.com/g/gJd5GhWOEj

2x_Garfield1_308k.onnx model you shared works for me (mlrt_ncnn/ov).

2x_AniScale2_ESRGAN_Lite_i16_165K.onnx you shared works for me with mlrt_ov but it crashes with mlrt_ncnn. I'll look into it in the coming days.

WolframRhodium commented 11 months ago

ncnn does not support onnx's Shape operator, which is used in 2x_AniScale2_ESRGAN_Lite_i16_165K.onnx. The onnx should be re-exported (from PyTorch) with this operation removed/converted.
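A quick way to check a model for the operator mentioned above is to count node types in the ONNX graph. The sketch below is illustrative and not part of avs-mlrt: `count_ops` is a hypothetical helper that works on any object shaped like a loaded ONNX model (a `.graph.node` sequence whose items have `.op_type`), and it only inspects the top-level graph, not nested subgraphs.

```python
from collections import Counter


def count_ops(model, op_types=("Shape",)):
    """Count top-level graph nodes whose op_type is listed in op_types.

    `model` only needs a `.graph.node` iterable of objects exposing
    `.op_type`, which is how a loaded ONNX ModelProto is laid out.
    Subgraphs nested inside control-flow nodes are not visited.
    """
    wanted = set(op_types)
    return Counter(
        node.op_type for node in model.graph.node if node.op_type in wanted
    )
```

Assuming the third-party `onnx` package is installed, `count_ops(onnx.load("model.onnx"))` returning an empty counter would suggest the top-level graph has no `Shape` nodes. It does not by itself guarantee that ncnn can load the model.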

Asd-g commented 11 months ago

@WolframRhodium, thanks for the input.

Kuronoe-Ookami commented 11 months ago

I don't know what could be causing this resolution problem here. I tested several things and updated my AviSynth+ to the latest version recommended here. I thought it could be the VRAM (3060 Ti 8GiB), but the usage is below 50%, and the RAM is free too (16GiB), using the same script as you posted above. A friend tested it on a 2060S GPU and has the same problem.

I managed to feed in a Spline/nnedi3 resize to 1280x720, but I had to adjust tilesize_w. It then worked normally. I'm just not sure whether the pre-resize still has any effect or whether this is very wrong; I didn't find information about tiles. Test script: https://katb.in/xayomagubop

Just for testing purposes: VSGAN with the .pth model delivers almost 4 fps more at the same 1024x576 resolution, and I can use any input up to 2K without crashing. I think it's unlikely that the problem is VRAM or other hardware. I would even use VSGAN, but vs-preview drives me crazy while avsp-gispos is very functional, and VapourSynth doesn't have other filters that I use for more specific cases.

@WolframRhodium, I passed on what you said to one of the guys who makes models; he wasn't sure if it would work and decided to test it.

Anyway, thank you for everything. I will continue testing other options.

In use: AviSynth+ r3928 | all latest filters, CUDA and GPU driver updated | PC: i7 13700KF, GPU 3060 Ti 8GiB, RAM 16GiB.

Asd-g commented 11 months ago

Here is the description of tilesize_w/h.

# 1024x576 src
Spline64Resize(1280, 720) #test
ConvertBits(32)
ConvertToPlanarRGB()
mlrt_ncnn(network_path="C:\Users\xyz\Downloads\model\2x_AnimeJaNai_V3_SmoothRC21_Compact_50k_fp32.onnx", fp16=true, num_streams=4, builtin=false)
ConvertBits(8)
ConvertToYV12()
deep_resize(width=1920, height=1080, flat="Zopti1080")

If I try the above script I get an error from deep_resize. If I comment out or remove Spline64Resize(1280, 720) #test, the script loads without issues.

I'm using the latest deep_resize, and all its dependencies are from the master branch of that repository.

I also use the latest version of AviSynth+.

We have different filter chains.

Some more tests:

If you use tilesize_w/h smaller than the clip width/height, the impact of the pre-resizer is still there. You can consider using overlap_w/h too.
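To make the idea of tiles smaller than the clip concrete, here is a minimal sketch of how a frame dimension can be split into overlapping tiles. It is a generic illustration, not the algorithm avs-mlrt uses: `tile_starts` is a hypothetical helper, and the exact meaning of overlap_w/h inside mlrt_ncnn may differ from the "shared pixels between neighbours" definition assumed here.

```python
def tile_starts(length, tile, overlap):
    """Return start offsets of tiles of size `tile` that cover [0, length).

    Assumption for this sketch: neighbouring tiles share at least `overlap`
    pixels, and the last tile is shifted back so it ends exactly at `length`.
    """
    if tile >= length:
        # A single tile already covers the whole dimension.
        return [0]
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    step = tile - overlap
    starts = list(range(0, length - tile, step))
    starts.append(length - tile)  # final tile aligned to the far edge
    return starts
```

For example, `tile_starts(1280, 640, 0)` gives two side-by-side tiles starting at 0 and 640, while a 1024-wide clip with a 1024-wide tile needs only the single tile at offset 0.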

vsgan uses CUDA (NVIDIA only) while mlrt_ncnn uses Vulkan (NVIDIA, AMD, Intel). It's expected that the speed differs, and in 99% of cases the CUDA implementation is faster than the Vulkan one. I'm planning to port vsort, which will give better speed to owners of NVIDIA GPUs.

kedaitinh12 commented 11 months ago

I think the CUDA version will be faster than Vulkan, but you don't have a CUDA GPU to build it, @Asd-g

Asd-g commented 11 months ago

vsort has support for CPU too. So I can test it on CPU and just implement the GPU support too.

Kuronoe-Ookami commented 11 months ago

You're probably right about the filter chain, I had a hard time getting these filter packs to work.

I made a short video with the many resize variations: catbox. I also added a small cut from the video source (40 seconds): mediafire

It's weird. I've tried several things, even removing all the filters from the folder and leaving the minimum, but it doesn't have any effect. It's not something that gets in the way, it just limits it, so I'll have to deal with it like I used to.

In any case, sorry for taking up your time with this, and thank you for your help.

Asd-g commented 10 months ago

Thanks for the clips.

I cannot reproduce the issue - https://imgbox.com/g/pmVdbY0Ler

Last ideas:

WolframRhodium commented 10 months ago

vsort also has support for DirectML which is available on AMD GPUs. It is faster than ncnn_vk in some benchmarks.

Kuronoe-Ookami commented 10 months ago

I think we've reached the end of testing. I only have the GPU driver downloaded by NVIDIA Experience. I had already looked for something that was getting in the way; I even removed all the filters from AviSynth and left only the necessary ones. I had already tested these parameters before. I generally use dgi and ffms2; it's not them, and I tested with LSmash and it had the same problem. The conclusion was that it is related to width: I can use any odd resolution (720x1280; 1024x1080; 1024x2080), and as long as it doesn't exceed 1024 in width, it doesn't happen. For now, adjusting the tiles has resolved the issue.

The solution is to wait; eventually I'll need to format the computer to add something, and then I will test everything from scratch. :D

Thank you very much for your time and for so many tests.

Asd-g commented 10 months ago

@WolframRhodium, thanks for the info.

@Kuronoe-Ookami, I don't know if you tried fp16=false, but I got images that show the same issue as yours - https://imgbox.com/g/41Dc5ozxe9

Kuronoe-Ookami commented 10 months ago

Hi, I tested it on the trim I sent above and, strangely, it solved the problem, just like the tiles idea. I remember there was an occasion where I had to activate it, but I've forgotten the reason. In your images the problem almost goes unnoticed (I see very little of it in the fence); it tends to appear more in elongated curves. I wonder what fp16 changes and whether it's related to fp16 versus fp32 models, because fp16 models didn't work here when I tested; I even had to ask the Janai author for an fp32 save.

Still, it's great news. Thanks for the feedback.
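On the question of what fp16 changes: half precision stores fewer significant bits than single precision, so values are rounded more coarsely. The sketch below only illustrates that numeric difference using Python's standard `struct` module. `to_fp16` and `to_fp32` are hypothetical helper names, and nothing here shows how mlrt_ncnn uses fp16 internally or that rounding is what caused the artifacts in this thread.

```python
import struct


def _roundtrip(value, fmt):
    """Store `value` in the given struct float format and read it back."""
    return struct.unpack(fmt, struct.pack(fmt, value))[0]


def to_fp16(value):
    """Round a Python float to IEEE 754 half precision (16 bits)."""
    return _roundtrip(value, "<e")


def to_fp32(value):
    """Round a Python float to IEEE 754 single precision (32 bits)."""
    return _roundtrip(value, "<f")


# Half precision has an 11-bit significand, so 2049 cannot be stored exactly.
print(to_fp16(2049.0))  # 2048.0
print(to_fp32(2049.0))  # 2049.0

# The rounding error for 0.1 is much larger in fp16 than in fp32.
print(abs(to_fp16(0.1) - 0.1) > abs(to_fp32(0.1) - 0.1))  # True
```

The same kind of rounding applies to weights and intermediate values when a network runs in half precision, which is one general reason fp16 and fp32 outputs can differ.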

Asd-g commented 7 months ago

Added mlrt_ort.

kedaitinh12 commented 7 months ago

Added mlrt_ort.

Thanks