haasn / libplacebo

Official mirror of libplacebo
http://libplacebo.org/
GNU Lesser General Public License v2.1

Possible to transfer rendered frame to AVFrame? #279

Closed by spedagadi 4 months ago

spedagadi commented 4 months ago

Is there a utility function in libplacebo to get hold of the displayed frame after rendering and convert it to ffmpeg's AVFrame? My use case is to have a HW encoder in ffmpeg downstream of the renderer. I understand glReadPixels may be able to do it, but I was wondering whether that call would be efficient.

haasn commented 4 months ago

Does the pl_swapchain_frame.fbo have fbo->params.host_readable set? If so, you can just call pl_download_avframe on it, before calling pl_swapchain_submit_frame to present it to the display.

Otherwise, you will need to create a second texture, render to that (instead of the fbo), and then pl_tex_blit() to the swapchain fbo before presenting.

spedagadi commented 4 months ago

I am using glfw-d3d11, and fbo->params.host_readable is set to false. As for the alternate solution, would it look like this? https://github.com/mpv-player/mpv/blob/e509ec0aaffce74e520702e16e3e21ea0f168940/video/out/vo_gpu_next.c#L1369 Can the secondary fbo be created once and reused? In the code above, it looks like it is created on the fly and destroyed after the screenshot completes.

Also, is it possible to force fbo->params.host_readable to true during window creation (I am interested in a cross-platform solution, not only glfw-d3d11), so that the extra steps of rendering to a secondary texture and calling pl_tex_blit() can be avoided?

haasn commented 4 months ago

> I am using glfw-d3d11, and fbo->params.host_readable is set to false. As for the alternate solution, would it look like this? https://github.com/mpv-player/mpv/blob/e509ec0aaffce74e520702e16e3e21ea0f168940/video/out/vo_gpu_next.c#L1369 Can the secondary fbo be created once and reused? In the code above, it looks like it is created on the fly and destroyed after the screenshot completes.

If you do it on every frame, it's probably better to create it once and reuse it. But for occasional use it shouldn't matter.
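The "create once and reuse" advice can be sketched as a small cache that only reallocates when the requested size changes. This is a minimal, self-contained illustration, not libplacebo code: `capture_tex`, `capture_cache`, and both functions are hypothetical names, and plain `malloc`/`free` stand in for creating and destroying a GPU texture. In a real program the struct would hold a `pl_tex`; libplacebo's `pl_tex_recreate` appears to cover the same pattern, but check the headers of your version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a GPU texture; real code would hold a pl_tex. */
struct capture_tex {
    int w, h;
};

/* Cache holding at most one capture texture plus a creation counter. */
struct capture_cache {
    struct capture_tex *tex;
    int creations; /* how many times a texture was actually allocated */
};

/* Return a texture of the requested size, reallocating only on size change. */
static struct capture_tex *capture_cache_get(struct capture_cache *c, int w, int h)
{
    assert(w > 0 && h > 0);
    if (c->tex && c->tex->w == w && c->tex->h == h)
        return c->tex; /* fast path: reuse the existing texture */

    free(c->tex); /* real code: destroy the old GPU texture here */
    c->tex = malloc(sizeof(*c->tex));
    if (!c->tex)
        return NULL;
    c->tex->w = w;
    c->tex->h = h;
    c->creations++;
    return c->tex;
}

/* Release the cached texture, e.g. at shutdown. */
static void capture_cache_reset(struct capture_cache *c)
{
    free(c->tex);
    c->tex = NULL;
}
```

With a zero-initialized `struct capture_cache`, calling `capture_cache_get` every frame with the same dimensions allocates once; a window resize triggers exactly one reallocation.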

> Also, is it possible to force fbo->params.host_readable to true during window creation (I am interested in a cross-platform solution, not only glfw-d3d11), so that the extra steps of rendering to a secondary texture and calling pl_tex_blit() can be avoided?

Not sure about d3d11, maybe CC @kasper93

Have you considered using Vulkan?

spedagadi commented 4 months ago

On Windows, with glfw-vk set for window creation, the call to pl_map_avframe_ex fails at https://github.com/haasn/libplacebo/blob/1fd3c7bde7b943fe8985c893310b5269a09b46c5/demos/plplay.c#L226 and I have not fully investigated why. I am using glfw-d3d11 to make progress for the time being.

On the other hand, some ARM chips such as the RK3588 appear to have no Vulkan support, and for them glfw-gl with OpenGL ES is at least working (I had to disable the upscaling filter, which is a topic for later).

spedagadi commented 4 months ago

I implemented the second texture. A PNG of the rendered frame looks like this:

[image: dvr-000000]

This is a 4K frame with an overlay, but there seems to be a scaling issue with the overlay: it is much smaller than what I actually see on screen, shown below (a photo of a 1080p monitor displaying the 4K content).

[image: IMG_1924]

The other issue is that after pl_tex_blit() to the swapchain fbo, the screen goes blank, which suggests I am doing something incorrectly. Here is the code: https://github.com/spedagadi/eva-dtm/blob/83537b37d7233a06ca990b25fc5d184cb60cda42/src/main.c#L783 Any thoughts on what I could be doing wrong?

Also, downloading the rendered frame to an AVFrame and then encoding it slows rendering down on a 2080 Ti (I even tried disabling PNG encoding in ffmpeg and keeping only the pl_download_avframe call), so it looks like downloading a 4K frame to the CPU is somewhat taxing. Can I use a 'delayed' capture that lags by 1-2 frames, so that rendering and capture run in separate threads, and does libplacebo allow such buffering?

haasn commented 4 months ago

In your code you are rendering (and then downloading) the screenshots at the source resolution, not the final output resolution. I thought the intent was the latter, but your code does the former. That will be the main source of the performance difference.
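To put a rough number on the resolution difference, the following sketch computes the readback size per frame. It assumes a tightly packed format with 4 bytes per pixel (for example 8-bit RGBA) and a 60 fps capture rate; neither value is stated in the thread, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for one tightly packed frame. */
static uint64_t frame_bytes(uint32_t w, uint32_t h, uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return (uint64_t) w * h * bytes_per_pixel;
}

/* Bytes per second read back when every frame is downloaded. */
static uint64_t readback_bytes_per_sec(uint64_t bytes_per_frame, uint32_t fps)
{
    return bytes_per_frame * fps;
}
```

Under those assumptions, `frame_bytes(3840, 2160, 4)` is 33,177,600 bytes while `frame_bytes(1920, 1080, 4)` is 8,294,400 bytes, so capturing at the 4K source resolution moves four times as much data as capturing at a 1080p output resolution. At 60 fps the 4K case amounts to 1,990,656,000 bytes per second of readback.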

To solve the issue of frame downloads blocking the encoding thread, the best solution would be to move the entire download process to a separate worker thread: pl_download_avframe + swscale + writing to PNG. That should speed things up considerably. You just need to make sure to synchronize access to the pl_frame in this case; probably what I would do is pl_tex_create a new target for each screenshot, then pl_tex_destroy it in the worker thread after the download.
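The hand-off to a worker thread described above might be structured as a small bounded queue. This is only a sketch using POSIX threads: `capture_job`, `capture_queue`, and all functions here are hypothetical, the job carries just a frame index where real code would carry the per-screenshot texture, and the libplacebo and ffmpeg calls are left as comments. Whether GPU calls may be issued from a second thread depends on the libplacebo version and backend, so treat that as an assumption to verify.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

#define QUEUE_CAP 4

/* Hypothetical capture job; real code would carry the texture and an AVFrame. */
struct capture_job {
    int frame_index;
};

struct capture_queue {
    pthread_mutex_t lock;
    pthread_cond_t nonempty;
    struct capture_job *jobs[QUEUE_CAP]; /* ring buffer of pending jobs */
    int head, count;
    bool closed;
    int processed; /* number of jobs completed by the worker */
    pthread_t worker;
};

/* Worker: pop jobs until the queue is closed and drained. */
static void *capture_worker(void *arg)
{
    struct capture_queue *q = arg;
    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->count == 0 && !q->closed)
            pthread_cond_wait(&q->nonempty, &q->lock);
        if (q->count == 0) { /* closed and nothing left to do */
            pthread_mutex_unlock(&q->lock);
            return NULL;
        }
        struct capture_job *job = q->jobs[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        pthread_mutex_unlock(&q->lock);

        /* Real code: download the texture, convert, encode, destroy it. */
        free(job);

        pthread_mutex_lock(&q->lock);
        q->processed++;
        pthread_mutex_unlock(&q->lock);
    }
}

/* Initialize the queue and start the worker thread. */
static bool capture_queue_start(struct capture_queue *q)
{
    assert(q != NULL);
    *q = (struct capture_queue) {0};
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->nonempty, NULL);
    return pthread_create(&q->worker, NULL, capture_worker, q) == 0;
}

/* Called from the render thread. Never blocks: returns false if full. */
static bool capture_queue_push(struct capture_queue *q, int frame_index)
{
    struct capture_job *job = malloc(sizeof(*job));
    if (!job)
        return false;
    job->frame_index = frame_index;

    pthread_mutex_lock(&q->lock);
    bool ok = !q->closed && q->count < QUEUE_CAP;
    if (ok) {
        q->jobs[(q->head + q->count) % QUEUE_CAP] = job;
        q->count++;
        pthread_cond_signal(&q->nonempty);
    }
    pthread_mutex_unlock(&q->lock);
    if (!ok)
        free(job); /* drop the capture rather than stall rendering */
    return ok;
}

/* Close the queue, wait for the worker to drain it, return jobs processed. */
static int capture_queue_stop(struct capture_queue *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = true;
    pthread_cond_broadcast(&q->nonempty);
    pthread_mutex_unlock(&q->lock);
    pthread_join(q->worker, NULL);
    int n = q->processed;
    pthread_cond_destroy(&q->nonempty);
    pthread_mutex_destroy(&q->lock);
    return n;
}
```

The push deliberately drops a capture when the queue is full instead of waiting, so a slow encoder costs captured frames rather than render time. An encoder that must see every frame would need a blocking push or a larger queue instead.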

spedagadi commented 4 months ago

Thanks, that makes sense. I will implement these. Closing this issue, as the primary objective was achieved.