xbmc-imx6 / xbmc

XBMC Main Repository
http://xbmc.org
Other
32 stars 5 forks source link

About PTS #32

Open wolfgar opened 10 years ago

wolfgar commented 10 years ago

Intro

I open this new thread following the discussion in #8 and the recent PR #28. The point is to share about the best solution to implement properly pts (presentation timestamp) association with output frames.

What's the question about ?

Encoded frames are pushed out of the demuxer in ::Decode function. The encoded unit is associated with a pts, at least for decent (ie no AVI) containers. We will have to keep proper association between this encoded data and the resulting output picture as the pts is useful to know when the final output pictures (retrieved in ::Getpicture) have to be displayed. Conceptually it is pretty simple but the VPU library does not provide any help to track these pts and we have to implement this logic by ourselves. During the initial implementation we relied upon a freescale library mfw_gst_ts which is provided with their gstreamer plugins. Even if this library has some assets (especially, when MODE_AI is selected and when functions with a '2' suffix are used) it also has some subtle issues that result in inappropriate time corrections (even when we really strive to push only pts that have to be queued) and it is considered overcomplicated for integration in this xbmc implementation.

One could think a simple FIFO would just be fine but it is not as simple :

Before explaining further the current implementation let's remind how VPU processes data : A set of buffers (contiguous in memory so that they can used for DMA) are allocated for VPU use. These buffers are used to store decoded pictures. When encoded data is submitted to the VPU to be processed we retrieve 2 indexes : an index for consumed frame and an index for display frame. If these indexes are positive value they point to a VPU buffer. This way of working is really the one which occurs at low level (at the interface with the DSP) If a consumed index is returned, the flag VPU_DEC_ONE_FRM_CONSUMED is set and if a display index is available the flag VPU_DEC_OUTPUT_DIS is set out of VPU_DecDecodeBuf function (this function is provided by libfslvpuwrap library) Note that, we operate the VPU so that it ensures display pictures are issued in order.

Back to pts handling, it whole idea of the current implementation is to associate the pts to the VPU buffer when the frame is consumed. That way we are sure that we will retrieve the correct pts associated to this buffer when the picture will be out for display. Also note that doing so will "naturally" reorder the pts at the same time as the display buffers are reordered. If is not the case it would be the proof that something goes wrong. And in fact this anomaly was witnessed with some livetv stream at some stage : This is exactly what PR #28 addresses.

As an alternative, an implementation based on an sorted queue was proposed in PR #8. The approach is of course perfectly valid : It addresses the 1st drawback of FIFO approach. Yet the 2 others items have still to be addressed to get a robust implementation based on sorted queue.

Conclusion

I hope that this little post can be helpful to clarify the whole pts subject and will help to decide what we want to do for the definitive implementation...

smallint commented 10 years ago

Thanks a lot Stephan. Since associating pts with an output buffer as in #28 should not make a difference as long as the VPU reorders frames correctly than using a priority queue I would go for the current approach as in #28. The main reasons for that are:

koying commented 10 years ago

As a follow-up to #42 , the fact that VPU_DEC_ONE_FRM_CONSUMED is not generated by all codecs is a big issue. Passing no pts back to XBMC apparently does not work flawlessly, especially when frame drops occurs, and will most probably be refused by the XBMC reviewers, anyway. So some workaround has to be found...

wolfgar commented 10 years ago

You are right : the no pts / no VPU_DEC_ONE_FRM_CONSUMED have to be properly handled. They are not at this time and I don't have a proper workaround right now. Especially, the no VPU_DEC_ONE_FRM_CONSUMED is annoying because I don't know about a good mean to know for sure whether we have to store the ts or not ...

koying commented 10 years ago

I was thinking about VPU_DEC_ONE_FRM_CONSUMED, what it does and the differences with VPU_DEC_OUTPUT_DIS. My understanding is that, due to the fact we are asking the vpu to do pts re-ordering, VPU_DEC_ONE_FRM_CONSUMED returns a pointer to the output frame in its internal sorted queue corresponding to the input frame it just handled, right? If so:

Thinking aloud, but if we disable VPU pts re-ordering and doing the re-ordering ourselves, wouldn't VPU_DEC_ONE_FRM_CONSUMED = VPU_DEC_OUTPUT_DIS for all codecs and we could then do all pts handling in VPU_DEC_OUTPUT_DIS, including returning the dts to XBMC if needed ?

smallint commented 10 years ago

If your assumption is correct (I would agree) wouldn't it be enough to set a flag between the two VPU_DEC_OUTPUT_DIS events when VPU_DEC_ONE_FRM_CONSUMED was emitted and use pts/dts in VPU_DEC_OUTPUT_DIS otherwise? This would even work when those modes are mixed though I am not sure if this is possible. And we don't need to do reordering ourselves. Just thinking loudly. I haven't tested the examples yet.

koying commented 10 years ago

That deserves a try, for sure.

wolfgar commented 10 years ago

Both comments are very interesting paths that are definitively worthy to explore yes ! It is inspiring, thanks for your feedback..

wolfgar commented 10 years ago

I have just run a few set of tests and unfortunately even when we do not ask for reordering the indexes are not used nor output in a pure sequential order. While digging a little in vpu documentation and libraries source code, it appears that the reordering option is only valid for AVC codec and is ignored for others (and especially the VC1 case study) During this investigation I have just found that, at low level, unfortunately, consumed frame and displayed frame are not fully equivalent (I have low level VPU logs where a display frame is set but no consumed frame). But there is a very good news : at low level, the VPU properly reports the consumed frame index even when VC1 is used ! We are victim of an artificial filtering by libfsvpuwrap here... It means that in reality, we have a proper report from VPU that the frame is consumed and associated to a specific buffer even for vc1...

wolfgar commented 10 years ago

As a complement, I have to be clear that he definitive fix has still to be implemented when no pts is provided and only dts is. In this case I cannot imagine another solution than the use an ordered queue filled by dts... This feeding should occur on frame consumption as we know for sure that the frame has been taken into account by vpu and will (except in rare error cases) produce one output frame...

emveepee commented 10 years ago

Excuse my naïveté but why does pts seem to be preferred over dts? Many of my broadcast mpeg-ts files have (at the the transport stream level) pts that can jump down to zero or are stitched together but the dts is almost always correct and sequential.

Martin

wolfgar commented 10 years ago

Hi Martin, pts are better than dts because they really say : that frame has to be displayed at this timestamp. If you look at the pts out of the demuxer they look a little crazy and in a bad order but they are not so crazy. The fact is that encoded frames are not (at least for some codec) in chronological order so the attached pts are neither... There are containers that do not provide pts but only dts (for instance avi). In this case we revert to the use of dts...

emveepee commented 10 years ago

Ok I don't know much about the logic in xbmc, I just know that many players have trouble when the pts on an mpeg-ts rolls over to zero.

Martin

wolfgar commented 10 years ago

Ho ok, If you have a specific file, you can share it with us and we will check if it is handled properly...

emveepee commented 10 years ago

Ok I uploaded pts_gap.ts to your server. I does seem ok, the pcr/pts change is at 28 seconds

Martin

wolfgar commented 10 years ago

yep, thanks : tested OK with our xbmc build there is very likely a filtering out of demuxer because from my decoder point of view the pts do not exhibit the gap and I had to check with another player to be sure about its location ;)