AmusementClub / vs-mlrt

Efficient CPU/GPU ML Runtimes for VapourSynth (with built-in support for waifu2x, DPIR, RealESRGANv2/v3, Real-CUGAN, RIFE, SCUNet and more!)
GNU General Public License v3.0
268 stars 18 forks source link

[Question] onnx optimization for vulkan already implemented? #93

Closed Bercraft closed 3 months ago

Bercraft commented 3 months ago

Hello, I was using the Vulkan support on Windows with a Radeon Pro W5700, but saw really slow speeds when using AnimeJaNai. I have read that you convert ONNX models to ncnn; I was wondering whether you were aware of this conversion optimizer for Vulkan.

Optimizer: https://github.com/daquexian/onnx-simplifier
Main conversion page: https://github.com/Tencent/ncnn/wiki/use-ncnn-with-pytorch-or-onnx
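For reference, a typical onnx-simplifier invocation looks like the following (the `onnxsim` package and CLI come from the project linked above; the model filenames here are just placeholders):

```shell
# Install the simplifier and run it on an example model.
# "model.onnx" / "model_sim.onnx" are placeholder filenames.
pip install onnxsim
onnxsim model.onnx model_sim.onnx
```

The tool folds constants and removes redundant ONNX operators, which mainly helps runtimes that translate the graph node by node.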

Thank you for your work.

WolframRhodium commented 3 months ago

Hi, onnx-simplifier will not improve the performance of AnimeJaNai models. What is your current performance?

Bercraft commented 3 months ago

Sorry, after digging in the code I found why.

```python
filter_output = RealESRGANv2(
    filter_input,
    model=RealESRGANv2Model.animejanaiV3_HD_L1,
    backend=Backend.NCNN_VK(
        fp16=True,
        device_id=0,
        num_streams=4,
    ),
)
```

After adding the fp16 and num_streams lines I got 15 fps at roughly 4 GB of VRAM. Note that I am not using the model to upscale, only as a refiner (it works like Anime4K when restoring, but seems better). Maybe a better description of the Vulkan backend options would help less guru-level people?

I am willing to test Vulkan builds for you if you need.

WolframRhodium commented 3 months ago

You may try the ORT_DML backend. Note that the W5700 does not have high fp16 performance due to the lack of specialized hardware units.
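A minimal sketch of switching to that backend, assuming the same vsmlrt wrapper and the `filter_input` clip from the earlier snippet (the `num_streams` value here is just an illustrative guess to tune per GPU):

```python
# Sketch: same filter call as before, but routed through the
# DirectML (ONNX Runtime) backend instead of ncnn/Vulkan.
from vsmlrt import RealESRGANv2, RealESRGANv2Model, Backend

filter_output = RealESRGANv2(
    filter_input,
    model=RealESRGANv2Model.animejanaiV3_HD_L1,
    backend=Backend.ORT_DML(
        device_id=0,
        num_streams=2,  # illustrative value; tune per GPU
        fp16=False,     # the W5700 lacks dedicated fp16 units
    ),
)
```

With fp16 disabled, memory use roughly doubles but the GPU avoids its slow half-precision path.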

Bercraft commented 3 months ago

It's even worse with DirectML. I will wait for a better Vulkan implementation. Thank you.