a small suggestion for optimizing the running speed of iw3 - Githubissues

nagadomi / nunif

Misc; latest version of waifu2x; 2D video to stereo 3D video conversion

MIT License

1.58k stars, 142 forks

a small suggestion for optimizing the running speed of iw3 #76

Closed wududu123 closed 8 months ago

wududu123 commented 9 months ago

I ran iw3 with --depth-model Any_S, but the speed is only 9.07it/s when converting a 1080p video on an RTX 3090. The inference time of Any_S is about 15ms, so I'm guessing that rendering the SBS output takes most of the time. A few years ago, I implemented a 2D-to-3D conversion based on the 3d-ken-burns project, which uses a completely different rendering method. Could we use it to calculate the grid and then pass that grid to grid_sample? It renders with CUDA and is very, very fast! I also reimplemented it on iOS using Metal.
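
The arithmetic behind that guess can be checked in a few lines. This is only a back-of-the-envelope sketch that uses the two figures quoted above (9.07it/s and about 15ms) and assumes one iteration corresponds to one frame; the variable names are made up for illustration.

```python
# Figures quoted in the comment above.
throughput_it_per_s = 9.07   # end-to-end iw3 speed on 1080p input
inference_ms = 15.0          # reported Any_S inference time per frame

# Assumption: one iteration corresponds to one video frame.
frame_ms = 1000.0 / throughput_it_per_s   # total wall time per frame
other_ms = frame_ms - inference_ms        # everything except depth inference
inference_share = inference_ms / frame_ms

print(f"per-frame time:     {frame_ms:.1f} ms")
print(f"non-inference time: {other_ms:.1f} ms")
print(f"inference share:    {inference_share:.1%}")
```

Under that assumption a frame takes roughly 110ms, so depth inference accounts for well under a fifth of it. The numbers alone do not say which of the remaining steps dominates.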

nagadomi commented 8 months ago

You can use the --method grid_sample option to use the simple grid_sample method on the GPU (Method -> grid_sample in the GUI). However, it causes ghost artifacts at the edges between foreground and background. The default row_flow model is lightweight (only 0.016M parameters) and already runs on the GPU. Of course, it is slower than grid_sample.
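
To make the grid-sampling idea concrete, here is a minimal sketch of a depth-driven horizontal warp on a single image row, in plain Python. It is not iw3's code: `sample_row`, `warp_row`, `max_shift` and the depth convention (values in [0, 1], larger meaning nearer) are assumptions chosen for illustration. In PyTorch, the lookup would normally be done for the whole image by `torch.nn.functional.grid_sample`.

```python
def sample_row(row, x):
    """Linearly interpolate `row` at fractional position x, clamped to the borders."""
    x = min(max(x, 0.0), len(row) - 1.0)
    x0 = int(x)
    x1 = min(x0 + 1, len(row) - 1)
    t = x - x0
    return row[x0] * (1.0 - t) + row[x1] * t


def warp_row(row, depth, max_shift, sign):
    """Backward warp: output pixel x reads the input at x + sign * max_shift * depth[x].

    Assumption: depth values lie in [0, 1] and larger means nearer.
    """
    return [sample_row(row, x + sign * max_shift * d) for x, d in enumerate(depth)]


row = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]   # toy pixel intensities
depth = [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]       # toy depth, nearer towards the right

# The two views of a side-by-side pair use opposite shift directions.
view_a = warp_row(row, depth, max_shift=1.0, sign=+1)
view_b = warp_row(row, depth, max_shift=1.0, sign=-1)

print(view_a)  # [10.0, 20.0, 35.0, 45.0, 60.0, 60.0]
print(view_b)  # [10.0, 20.0, 25.0, 35.0, 40.0, 50.0]
```

Each output pixel looks up its source using its own depth value, so the result is only a good approximation where depth changes smoothly. At sharp depth steps the lookup lands on the wrong surface, which is consistent with the edge artifacts described above.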

Also, the benchmark time of Any_S covers model.forward() only. iw3 is much slower because it includes a lot of other processing, such as video decoding/encoding and resizing and concatenating images at the original resolution.
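
One way to see how a per-frame budget splits across such steps is to accumulate wall-clock time per stage. The sketch below uses only the standard library; the four stage functions are hypothetical stand-ins, not iw3 functions, and exist only so the example runs.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

totals = defaultdict(float)  # stage name -> accumulated seconds


@contextmanager
def timed(stage):
    """Add the wall-clock time of the enclosed block to totals[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[stage] += time.perf_counter() - start


# Hypothetical stand-ins for the real pipeline stages.
def decode_frame(i):
    return [float(i)] * 1000

def estimate_depth(frame):
    return [v * 0.5 for v in frame]

def render_sbs(frame, depth):
    return [a + b for a, b in zip(frame, depth)]

def encode_frame(sbs):
    return len(sbs)


for i in range(50):
    with timed("decode"):
        frame = decode_frame(i)
    with timed("depth"):
        depth = estimate_depth(frame)
    with timed("render"):
        sbs = render_sbs(frame, depth)
    with timed("encode"):
        encode_frame(sbs)

grand_total = sum(totals.values())
for stage, seconds in totals.items():
    print(f"{stage:>7}: {seconds * 1e3:8.3f} ms ({seconds / grand_total:.0%})")
```

For GPU stages, CUDA kernels launch asynchronously, so a timer like this needs a `torch.cuda.synchronize()` call before each clock read to attribute time to the right stage.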

nagadomi commented 8 months ago

Point cloud rendering, as mentioned in the above project, may be one of the better ideas. However, I am not yet familiar with that area.

Regarding processing speed, I will try profiling it later.

nagadomi commented 8 months ago

The current row_flow model is slow because it runs at the original image resolution (1920x1080). The depth image output by the depth model has a resolution of 392 or 518, so it may be possible to run row_flow at a lower resolution.

wududu123 commented 8 months ago

I think we should be talking about increasing the rendering resolution to 4K rather than lowering it. I am still learning Python, but I guess a lot of variables could be initialized just once.

nagadomi commented 8 months ago

Currently, depth estimation is performed at a lower resolution, and the result is resized to the original resolution. So if the warp grid is calculated at a lower resolution and then resized, I guess the quality will not change much (I haven't tested this yet).
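
A toy 1D example gives some intuition for this guess. It assumes the horizontal shift is a simple linear function of depth, expressed as a fraction of the image width; the row_flow model is a learned function, so this is an illustration rather than a proof. `resize_linear` and `divergence` are hypothetical names.

```python
def resize_linear(values, new_len):
    """Resize a 1D list by linear interpolation with aligned endpoints."""
    if new_len == 1:
        return [values[0]]
    scale = (len(values) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        x = i * scale
        x0 = min(int(x), len(values) - 1)
        x1 = min(x0 + 1, len(values) - 1)
        t = x - x0
        out.append(values[x0] * (1.0 - t) + values[x1] * t)
    return out


low_depth = [0.0, 0.2, 0.8, 1.0]  # toy depth row at the depth model's resolution
full_width = 10                   # toy "original" resolution
divergence = 0.02                 # hypothetical max shift, as a fraction of image width

# Path A: resize the depth first, then compute the shift per full-resolution pixel.
shift_a = [divergence * d for d in resize_linear(low_depth, full_width)]

# Path B: compute the shift at low resolution, then resize the shift itself.
shift_b = resize_linear([divergence * d for d in low_depth], full_width)

max_diff = max(abs(a - b) for a, b in zip(shift_a, shift_b))
print(f"largest difference between the two paths: {max_diff:.2e}")
```

With a linear shift, the two orders agree up to floating-point rounding, while path B evaluates the shift on far fewer samples. Keeping the shift in normalized units (a fraction of the width) avoids having to rescale it when the grid is resized.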

nagadomi commented 8 months ago

I have made the above changes. With high-resolution input (1080p), FPS has improved by about 2x.

wududu123 commented 8 months ago

Awesome!!! I tested it, and the speed reached 25it/s.