williamyang1991 / VToonify

[SIGGRAPH Asia 2022] VToonify: Controllable High-Resolution Portrait Video Style Transfer

Input live webcam video for live streaming? #40

Open vallamost opened 1 year ago

vallamost commented 1 year ago

Has anyone tried, or thought about, using VToonify with live image input from a webcam or virtual camera? Or using it for livestreams?

Someone mentioned being able to use this on Linux - https://github.com/umlaeute/v4l2loopback

joshdance commented 1 year ago

Is VToonify fast enough to support live processing? I haven't used it yet (still researching), but I thought you had to process pre-recorded videos.

williamyang1991 commented 1 year ago

Since VToonify is based on a large StyleGAN/DualStyleGAN model, it is currently not real time. When we tested generating 1600×1280 videos on an NVIDIA Tesla V100 GPU, the ideal running time, excluding video reading/writing, was about 0.2 s per frame. With video reading/writing and the other processing needed for a live stream, the time per frame will be even greater.
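To put the 0.2 s per frame figure in streaming terms, here is a minimal back-of-the-envelope sketch. The function names and the 30 fps webcam rate are assumptions for illustration only and are not part of VToonify; the only number taken from this thread is the 0.2 s per frame.

```python
import math


def achievable_fps(seconds_per_frame: float) -> float:
    """Upper bound on output frame rate if every frame costs seconds_per_frame."""
    return 1.0 / seconds_per_frame


def frame_stride(source_fps: float, seconds_per_frame: float) -> int:
    """Process only every Nth captured frame so the stylizer keeps up with the source.

    Rounding before ceil guards against floating-point noise in the product.
    """
    frames_arriving_per_inference = round(source_fps * seconds_per_frame, 9)
    return max(1, math.ceil(frames_arriving_per_inference))


if __name__ == "__main__":
    per_frame = 0.2    # seconds, the V100 figure quoted above (model time only)
    webcam_fps = 30.0  # assumed capture rate, not from the thread
    print(f"max stylized fps: {achievable_fps(per_frame):.1f}")
    print(f"process every {frame_stride(webcam_fps, per_frame)}th frame")
```

Under these assumptions the model alone caps output at about 5 fps, so a 30 fps source would need to drop roughly five out of every six frames before capture and encoding overhead is even counted.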

ileocho commented 1 year ago

In your opinion, could it potentially run on lower-resolution videos? Or is this model specific to high-res?

williamyang1991 commented 1 year ago

For low resolution, I think it is possible to use stylegan-256 rather than stylegan-1024 and then train the corresponding dualstylegan and vtoonify models on top of it, which I suppose would need some code modifications.
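One reason such a switch would need code changes: in commonly used StyleGAN2 implementations, the number of per-layer style codes depends on the output resolution (2·log2(resolution) − 2). The sketch below only illustrates that arithmetic; the helper name is hypothetical and it is an assumption that VToonify's code hard-codes shapes tied to the 1024 setting.

```python
import math


def stylegan2_num_latents(resolution: int) -> int:
    """Number of per-layer style codes for a StyleGAN2 generator of this resolution.

    Uses the usual 2 * log2(resolution) - 2 rule; resolution must be a power of two >= 4.
    """
    log_size = int(math.log2(resolution))
    if resolution < 4 or 2 ** log_size != resolution:
        raise ValueError("resolution must be a power of two and at least 4")
    return 2 * log_size - 2


if __name__ == "__main__":
    for res in (1024, 256):
        print(res, stylegan2_num_latents(res))
```

A 1024 generator uses 18 style codes and a 256 generator uses 14, so any layer indices or tensor shapes written for the 1024 case would have to be revisited.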