webcamoid / akvirtualcamera

akvirtualcamera, virtual camera for Mac and Windows
GNU General Public License v3.0

Overall low performance: as expected or not? #17

Closed rse closed 3 years ago

rse commented 3 years ago

I'm currently experimenting with your AkVirtualCamera under Windows 10 and have to say it looks like a very promising solution. But I'm facing performance issues and I'm not sure whether I should expect them or not. Let me describe my setup:

I've compiled all parts from source as of today (2021-05-30) and set up the devices with the following INI file:

[Cameras]

cameras/size          = 2

cameras/1/description = Ak Virtual Camera 1
cameras/1/formats     = 1, 2, 3, 4

cameras/2/description = Ak Virtual Camera 2
cameras/2/formats     = 1, 2, 3, 4

[Formats]

formats/size          = 4

formats/1/format      = RGB24, YUY2
formats/1/width       = 480
formats/1/height      = 360
formats/1/fps         = 6, 12, 24, 30, 48, 60

formats/2/format      = RGB24, YUY2
formats/2/width       = 640
formats/2/height      = 480
formats/2/fps         = 6, 12, 24, 30, 48, 60

formats/3/format      = RGB24, YUY2
formats/3/width       = 1280
formats/3/height      = 720
formats/3/fps         = 6, 12, 24, 30, 48, 60

formats/4/format      = RGB24, YUY2
formats/4/width       = 1920
formats/4/height      = 1080
formats/4/fps         = 6, 12, 24, 30, 48, 60

Now just copying the frames from a Logitech BRIO 4K (in 720p30 mode) to AkVCamVideoDevice0 with the following command works as expected:

./ffmpeg.exe \
    -f dshow -rtbufsize 200M -s 1280x720 -r 30 -i video="BRIO 4K Stream Edition" \
    -f rawvideo -map 0:v -pix_fmt rgb24 -s 1280x720 - | \
./AkVCamManager.exe stream AkVCamVideoDevice0 \
    RGB24 1280 720

FFmpeg reports that it can sustain 30 fps, and I see no real delay in the resulting video device's output. Great! But when I try to shuffle frames in 1080p30 mode with the following command, FFmpeg achieves only 25 fps on average. After a few seconds (as expected, since the input still arrives at 30 fps) FFmpeg's "rtbufsize" buffer is full, FFmpeg complains, and the virtual camera device's output shows more and more delay:

./ffmpeg.exe \
    -f dshow -rtbufsize 200M -s 1920x1080 -r 30 -i video="BRIO 4K Stream Edition" \
    -f rawvideo -map 0:v -pix_fmt rgb24 -s 1920x1080 - | \
./AkVCamManager.exe stream AkVCamVideoDevice0 \
    RGB24 1920 1080
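To put numbers on the two cases, here is a rough back-of-the-envelope sketch of the raw RGB24 data rate that has to cross the stdio pipe. It assumes tightly packed RGB24 (3 bytes per pixel, no row padding); the helper name `frame_bytes` is made up for this illustration.

```shell
#!/bin/sh
# Raw RGB24 stores 3 bytes per pixel, so one frame is width * height * 3 bytes.
# frame_bytes is a hypothetical helper, not part of FFmpeg or AkVirtualCamera.
frame_bytes() {
    echo $(( $1 * $2 * 3 ))
}

f720=$(frame_bytes 1280 720)     # bytes in one 1280x720 frame
f1080=$(frame_bytes 1920 1080)   # bytes in one 1920x1080 frame

rate_720p30=$(( f720 * 30 ))     # pipe traffic for the working 720p30 case
rate_1080p30=$(( f1080 * 30 ))   # pipe traffic needed for 1080p30
rate_1080p25=$(( f1080 * 25 ))   # what is actually moved at 25 fps

# Data that arrives from the camera but is not drained each second.
backlog_per_s=$(( rate_1080p30 - rate_1080p25 ))

echo "720p30  RGB24: $rate_720p30 bytes/s"
echo "1080p30 RGB24: $rate_1080p30 bytes/s"
echo "1080p25 RGB24: $rate_1080p25 bytes/s"
echo "backlog at 25 fps: $backlog_per_s bytes/s"
```

So 1080p30 needs about 187 MB/s through the pipe versus about 83 MB/s for 720p30, and a consumer stuck at 25 fps falls behind by 5 frames (about 31 MB of RGB24) every second, which fits a real-time buffer filling up within seconds.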

This is on an average gaming PC where OBS Studio, ManyCam, and other video tools run regularly and can shuffle around 1080p30 without problems. So the bottleneck is either FFmpeg or perhaps AkVCamManager stream itself. Do you have any ideas what can cause such a low overall performance? Do you have any ideas how I can speed up the processing?

PS: The output of AkVCamManager.exe supported-formats -i shows just "RGB24", but I've nevertheless tried to feed YUY2 from FFmpeg to AkVCamManager. This way FFmpeg needed no conversion and was able to process faster, but the resulting video works in tools like SplitCam and not in Microsoft Teams. I guess feeding YUY2 is actually not intended, even if it looks like it partly works?

hipersayanX commented 3 years ago

Overall low performance: as expected or not?

For the results you are posting, I would say that is better than expected :laughing:

FFmpeg reports that it is able to drive 30fps and I see no real delay in the resulting video device's output. Great! But when I try to shuffle frames in a 1080p30 mode with the following command, FFmpeg is only able to achieve just 25 fps on average and after a few seconds (as expected as the input still comes with 30 fps) the FFmpeg "rtbufsize" is full and FFmpeg complains and the virtual camera device's output shows more and more delay

It's normal, the bigger the frame size, the higher the delay you will have.

Do you have any ideas what can cause such a low overall performance?

Yes, the frame format conversion and scaling are by far the biggest bottleneck.

Do you have any ideas how I can speed up the processing?

Yes, the frame processing could be improved by parallelizing the conversion: maybe using OpenMP, maybe using CPU intrinsics, maybe passing the task to the GPU, maybe using OpenCL or CUDA. There are a lot of options. For now, I am satisfied that it works at all, and I will try to improve it much later.

I guess feeding YUY2 is actually not intended, even if it looks like it partly works?

I'll add support for other input formats later.

rse commented 3 years ago

Ok, I've investigated deeper: it was entirely my fault for using a problematic setup! As a Unix veteran, I'm used to writing test pipelines as Unix shell scripts like the ones above, since even under Windows 10 I always work inside WSL. But the two programs in the pipeline, ffmpeg.exe and AkVCamManager.exe, are both regular Windows executables, while the stdio interconnect comes from the Unix shell! As a consequence, the video stream had to travel from the Windows world of ffmpeg.exe to the WSL world (for the stdio pipe) and back again to the Windows world of AkVCamManager.exe. This double transfer between the worlds was what effectively limited the overall throughput! Once I executed the pipeline directly inside a Windows .bat file, it achieved 60 fps:

ffmpeg.exe -hide_banner -f dshow -rtbufsize 240M -s 1920x1080 -framerate 60 -i "video=BRIO 4K Stream Edition" -f rawvideo -map 0:v -pix_fmt rgb24 -s 1920x1080 - | AkVCamManager.exe stream AkVCamVideoDevice0 RGB24 1920 1080

So, sorry, there is no performance problem in either FFmpeg or AkVirtualCamera. It was the problematic interconnect!
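For anyone who wants to check whether their own shell's pipe is the limiting factor, a minimal sketch along these lines may help. It pushes 60 zero-filled frames of 1080p RGB24 size through a pipe and times the transfer. Note the assumptions: it uses only POSIX tools (`dd`, `wc`, `date`) as stand-ins for the real producer and consumer, so as written it measures a native pipe only. Reproducing the Windows/WSL crossing would mean swapping in Windows executables on both ends, which is not shown here. The one-second resolution of `date +%s` also makes the result a coarse estimate.

```shell
#!/bin/sh
# Size of one tightly packed 1920x1080 RGB24 frame (3 bytes per pixel).
frame_bytes=$(( 1920 * 1080 * 3 ))
frames=60   # roughly one second of 1080p60 video

start=$(date +%s)
# dd stands in for the producer, wc -c for the consumer; both are placeholders
# for the real programs on either side of the pipe.
bytes=$(dd if=/dev/zero bs="$frame_bytes" count="$frames" 2>/dev/null | wc -c)
end=$(date +%s)

elapsed=$(( end - start ))
# date +%s only has one-second resolution, so clamp to avoid dividing by zero.
if [ "$elapsed" -lt 1 ]; then
    elapsed=1
fi

echo "moved $(( bytes )) bytes in about ${elapsed}s"
echo "approx. $(( bytes / elapsed / 1000000 )) MB/s (coarse estimate)"
```

If the reported figure is well above the data rate the video mode needs, the pipe itself is unlikely to be the bottleneck in that environment.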

hipersayanX commented 3 years ago

Oh ok, I didn't test it in WSL; I usually use MSYS as my playground.