nvpro-samples / vk_video_samples

Vulkan video samples
Apache License 2.0
252 stars 41 forks source link

H.264 picture buffer state management seems to go wrong #72

Open charlie-ht opened 6 months ago

charlie-ht commented 6 months ago

This test file seems to expose a bug in the H.264 picture management code,

I can get the clip to decode "correctly" by using this workaround,

diff --git a/vk_video_decoder/libs/NvVideoParser/src/VulkanH264Parser.cpp b/vk_video_decoder/libs/NvVideoParser/src/VulkanH264Parser.cpp
index 6b08277..91813db 100644
--- a/vk_video_decoder/libs/NvVideoParser/src/VulkanH264Parser.cpp
+++ b/vk_video_decoder/libs/NvVideoParser/src/VulkanH264Parser.cpp
@@ -2888,7 +2888,7 @@ void VulkanH264Decoder::dpb_picture_start(pic_parameter_set_s *pps, slice_header
         iCur = MAX_DPB_SIZE;
         // initialize DPB frame buffer
         cur = &dpb[iCur];
-        if (cur->state != 0)
+        if (cur->pPicBuf != nullptr /* cur->state != 0 */)
         {
             output_picture(iCur, dpb[iCur].state);
         }

But I get pictures not supposed to be presented with this workaround. I'm unsure where in the picture management code something goes wrong exactly.

zlatinski commented 6 months ago

Thank you, Charlie! The issue here is that we are running out of framebuffer pictures. This content uses a huge DPB space (of 16 pictures) pre-allocated before any display operations. The pattern is IBBBPBBBP. Since this content is very simple, the decoder is running ahead of the presentation with a lot of buffers. You can fix that by increasing the ProgramConfig::numDecodeImagesInFlight from 8 to 9 or 10. Perhaps a better alternative would be to decrease the display queue size via ProgramConfig::backBufferCount from 8 to 3. This will also improve the presentation latency.

charlie-ht commented 6 months ago

Thank you Tony! Increasing numDecodeImagesInFlight to 10 got the sequence decoded correctly. Is there a principled way we can set this variable to something that will work based on the input sequence / other command line parameters? Decreasing the display queue size from 8 to 3 didn't help here, but perhaps it needs to happen in combination with the other suggestion.

zlatinski commented 6 months ago

As I have said above, a better (the right fix) alternative would be to decrease the display queue size via ProgramConfig::backBufferCount from 8 to 3. This will also improve the presentation latency, but because of the issue with the open-source swapchain implementation, we can't currently do that. I hope to find time to debug the issue with that soon.