TNTwise / rife-ncnn-vulkan

RIFE, Real-Time Intermediate Flow Estimation for Video Frame Interpolation implemented with ncnn library
MIT License

v4.14-lite is slower than v4.14 #3

vYLQs6 closed this issue 6 months ago

vYLQs6 commented 6 months ago

Hi, sorry for bothering you but it's kinda weird that the Lite version of 4.14 model is slower than the normal one.

I am using an RTX 4060, and all the other Lite models are quicker than the normal models, except 4.14-lite.

Is this a bug? Or just how things should work?

Thank you

KaFu74 commented 6 months ago

I think it's related to the model itself, here's an ongoing discussion on that in the SVP forum: https://www.svp-team.com/forum/viewtopic.php?pid=83825#p83825

Rife 4.14 "lite" is different from the other "lite" models. It uses a technique that requires more GPU memory bandwidth but less compute power. Rife 4.13 "lite" is similar to regular Rife 4.9, so if you struggle with 4.9 you will struggle with 4.13. I think Rife 4.14 may simply need more compute power and memory bandwidth than the 3060 Ti can provide.
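The bandwidth-versus-compute point can be illustrated with some rough arithmetic (an illustrative sketch, not code from the repo; the feature-map sizes are made up): a depthwise-separable convolution does far fewer multiply-accumulates than a standard convolution, but each of those operations touches relatively more memory, so it tends to be bandwidth-bound rather than compute-bound.

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs for a standard k x k convolution (stride 1, same padding)."""
    return h * w * c_out * c_in * k * k

def depthwise_separable_macs(h, w, c, k):
    """MACs for a depthwise (k x k per channel) + pointwise (1x1) pair."""
    depthwise = h * w * c * k * k
    pointwise = h * w * c * c
    return depthwise + pointwise

# Hypothetical 1080p feature map with 64 channels.
h, w, c = 1080, 1920, 64
regular = conv_macs(h, w, c, c, 3)
separable = depthwise_separable_macs(h, w, c, 3)
print(regular / separable)  # roughly 8x fewer MACs for the separable version
```

Fewer MACs does not automatically mean faster on a GPU: the depthwise stage has much lower arithmetic intensity, so a card with plenty of compute but limited memory bandwidth can end up slower on the "lite" model.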

TNTwise commented 6 months ago

From what I have observed, you can sometimes raise the thread counts on the lite models and still improve performance, so try -j 2:4:4 for the lite model. Other than that, I just convert the models and can't alter their speed. The slowdown could be due to the use of ConvolutionDepthWise layers instead of plain Convolution layers within the lite model. This is not a bug; it behaves exactly the same as the ONNX model.
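For reference, the suggested threading bump would look something like this on the command line (the model directory name `rife-v4.14-lite` and the frame folder names are assumptions here; use whatever ships with your build). The `-j` flag sets the load:proc:save thread counts.

```shell
# Bump the proc/save threads for the lite model as suggested above.
# -i / -o are the input and output frame directories (hypothetical names),
# -m selects the model folder, -j sets load:proc:save thread counts.
./rife-ncnn-vulkan -i in_frames -o out_frames -m rife-v4.14-lite -j 2:4:4
```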