Xilinx / video-sdk

https://xilinx.github.io/video-sdk

Lookahead help #80

Closed robashton closed 10 months ago

robashton commented 11 months ago

I assume I'm doing something badly here, but I can't for the life of me work out what! I've ignored #79 for now and set up a native environment which has about 80% of our build deps available, and I've done the necessary dance to get our video frames/packets into the decoder/encoder/scaler; all of that is working just great. (Bonus screenshot attached of Xilinx sat inside our workflows.)

[image: screenshot of Xilinx components inside the poster's workflows]

I can't for the life of me persuade the lookahead filter to function when I've got a hardware decoder sat upstream of it however.

In my simplest test, the functionality should loosely mirror what is going on in the u30 transcoder example (although the interaction is more like FFmpeg's implementation, in that we have allocated frames 'flowing' around with their own ref counted/gc system and not just a single state machine with fixed buffers).

I don't feel there is an awful lot to go wrong here (ref counting and side data aside), and yet it does.

I'm getting 6 frames out of my decoder after sending 10 frames in, and then it's just refusing to accept input or provide further output. Now obviously (obviously?) the lookahead filter adds a ref to the buffer when it is sent one, which I assume it will hold onto until it is done with the data, so it looks up the decoder pool that buffer came from and increases the size of that pool by the required amount. The decoder definitely knows about this because it goes on to increase the size of its HDR buffer pool when it hits its next cycle.

So why does the decoder decide it cannot receive data any more? Did I miss a step? I've turned on debug logging and done side by side comparisons between the u30 transcode demo and what my code ends up executing and they look very similar.

Any pointers for how to further debug this would be much appreciated.

NastoohX commented 11 months ago

Hi, sorry for the late reply. It can be a bit challenging to get a transcode pipeline going smoothly; however, the following may help with your debug effort:

1- Closely go over the xlnx_tran_frame_process function at https://github.com/Xilinx/video-sdk-u30-examples/blob/61b0e6fd24cd9d268ebd627b68bdcc3722b20d05/examples/u30/xma/transcoder/lib/src/xlnx_transcoder.c#L514C4-L514C27. In particular, note the return values from xma_dec_session_send_data and xma_dec_session_recv_frame.

2- Start by feeding your pipeline I-only and then IP-only frames. Introduce B frames once you are happy with proper operation.

3- If at all possible, debug against a file, and either step through with gdb or dump intermediary frames to a predetermined location for later inspection. (Take a look at xlnx_utils_copy_dev_frame_to_host_frame for device-to-host translation.)

Let me know if the above helps. Cheers,
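To illustrate why the return values in point 1 matter, here is a minimal, self-contained sketch of a send/receive loop that resends a packet whenever the decoder pushes back. It is a toy model, not the XMA API: the `toy_*` names and the three status codes are hypothetical, and the real meanings of the values returned by xma_dec_session_send_data and xma_dec_session_recv_frame should be taken from the XMA headers and the transcoder example.

```c
#include <stddef.h>

/* Hypothetical status codes for this sketch only; the real XMA headers
 * define their own names and values. */
typedef enum { TOY_OK, TOY_TRY_AGAIN, TOY_NO_FRAME } toy_status;

/* Toy decoder: accepts packets until `queued` reaches `limit`. */
typedef struct {
    size_t queued;
    size_t limit;
} toy_decoder;

toy_status toy_send(toy_decoder *d)
{
    if (d->queued >= d->limit)
        return TOY_TRY_AGAIN;   /* packet NOT consumed: caller must resend */
    d->queued++;
    return TOY_OK;
}

toy_status toy_recv(toy_decoder *d)
{
    if (d->queued == 0)
        return TOY_NO_FRAME;
    d->queued--;
    return TOY_OK;
}

/* Push `packets` packets through, resending whenever the decoder pushes
 * back. Returns the number of frames received; *retries counts resends. */
size_t toy_transcode(toy_decoder *d, size_t packets, size_t *retries)
{
    size_t frames = 0;
    *retries = 0;
    if (d->limit == 0)
        return 0;               /* a decoder that never accepts input */
    for (size_t i = 0; i < packets; i++) {
        while (toy_send(d) == TOY_TRY_AGAIN) {
            (*retries)++;
            if (toy_recv(d) == TOY_OK)   /* drain one frame to make room */
                frames++;
        }
    }
    while (toy_recv(d) == TOY_OK)        /* flush what is still queued */
        frames++;
    return frames;
}
```

In this model, treating the "try again" status as success would silently drop the packet, so the number of frames out would fall short of the number of packets in. That is one reason a mismatch between frames sent and frames received is worth checking against the status of every call.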

robashton commented 11 months ago

Hi,

WRT 1 + 2, I've based the code written almost entirely on the xlnx_transcoder demo (with some inspiration from the ffmpeg implementation, because it obviously has an "AVFrame" concept which requires slightly different ref counting interaction).

If I feed the decoder nothing but I-frames, I get 8 frames out after putting 10 in, but I'm still stuck with being unable to read from it or write to it after that point. I've made a note of some of the other games I've played to try and poke the system into telling me why it's deciding to do that.

Whatever it is, it's going to be small and subtle. The trace output by the transcode example and my own test is almost identical. The decode pool gets expanded in both cases by the required amount (11 in this test: LAH+1) when the first frame is delivered to LAH. In the example code that results in there being 21 frames alive at any one point out of the decoder, whereas in my own code it just stops at 10 frames regardless. If I comment out the LAH/encoder and just refuse to return the frames, I see the exact same behaviour (even if I then go and increase the size of the decoder pool myself manually).
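The arithmetic described above (an original pool of 10, extended by 11, giving 21 frames alive) can be sketched with a toy buffer pool. This is a model of the expected behaviour only: the `toy_pool` type and its functions are hypothetical and do not correspond to anything in XMA or the U30 plugins.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names come from the XMA or U30 APIs. */
typedef struct {
    size_t capacity;   /* buffers the pool may hand out in total */
    size_t in_flight;  /* buffers currently referenced downstream */
} toy_pool;

/* Take one buffer for a decoded frame; false means the decoder stalls. */
bool toy_pool_acquire(toy_pool *p)
{
    if (p->in_flight >= p->capacity)
        return false;
    p->in_flight++;
    return true;
}

/* Drop the last reference to one buffer, returning it to the pool. */
void toy_pool_release(toy_pool *p)
{
    if (p->in_flight > 0)
        p->in_flight--;
}

/* Grow the pool, as a downstream stage that holds frames might request. */
void toy_pool_extend(toy_pool *p, size_t extra)
{
    p->capacity += extra;
}

/* Decode up to `wanted` frames while nothing downstream releases any.
 * Returns how many frames came out before the pool ran dry. */
size_t toy_decode_until_stall(toy_pool *p, size_t wanted)
{
    size_t produced = 0;
    while (produced < wanted && toy_pool_acquire(p))
        produced++;
    return produced;
}
```

Starting from a capacity of 10 with every frame held, the model stops after 10 frames; extending by 11 lets 11 more through, for 21 in flight. If a real decoder still stops at 10 after the extension, the model suggests that whatever limit gates its output is not the extended capacity, though it cannot say which limit that is.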

I guess I'm looking for some insight here into other variables that the decoder might use to determine that it doesn't want input and can't give me output in a world where you're asking it to give you more frames than its original pool would allow.

(Note: I've got some fairly reasonable other transcode workflows running just fine, with multiple ladders through the scaler, side by side passthrough direct to the encoder etc - and I'm relatively confident that I'm doing my buffer mgmt correctly or that would be a disaster!)

NastoohX commented 11 months ago

Hi, I have asked our engineering team for advice and assistance on this matter, and will provide an update when available. Meanwhile, can you provide feedback on the following:

1- Is the relevant TLV set in XmaParameter prior to calling xma_filter_session_create? For example:

...
extn_params[param_cnt].name = (char*)XLNX_LA_EXT_PARAMS[EParamLADepth];
extn_params[param_cnt].user_type = EParamLADepth;
extn_params[param_cnt].type = XMA_UINT32;
...

2- Have you noticed any correlation between the incoming frame rate and the number of frames after which the decoder stops responding?
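One way to answer point 1 with certainty is to scan the parameter array just before the session is created and confirm the entry is present. The sketch below uses a local stand-in struct rather than the real XmaParameter: the `name`, `user_type` and `type` fields mirror the fragment in point 1, while the `value` field, the `toy_*` names and the parameter name strings are assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-in for a parameter entry. name/user_type/type mirror the
 * fragment in the thread; `value` is an assumption of this sketch. */
typedef struct {
    const char *name;
    int         user_type;
    int         type;
    const void *value;
} toy_param;

/* Return the entry called `name`, or NULL if it was never added. */
const toy_param *toy_find_param(const toy_param *params, size_t count,
                                const char *name)
{
    for (size_t i = 0; i < count; i++) {
        if (params[i].name != NULL && strcmp(params[i].name, name) == 0)
            return &params[i];
    }
    return NULL;
}
```

Logging the result of such a lookup (and the value it points at) in both the demo and the application under test gives a direct comparison, rather than relying on the two setup paths looking the same.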

Cheers,

robashton commented 11 months ago

1) Yup, I've set every property that the transcode demo sets (and as a result of trying to eliminate differences between them, the values are all now identical as is the source, as closely as I can manage anyway)

2) I have not - I hadn't considered it so I've run a few tests, the stoppage always occurs at 'decoder has output the same number of frames as the original pool size', which is largely (from what I can tell) chosen from the profile/level/etc and isn't (apparently) affected by the framerate.

Thanks for pushing forwards with this, hopefully we'll find out I've just mixed up a 0 and a 1 somewhere.

NastoohX commented 10 months ago

Hi, hmm... not sure why this would be happening. Engineering provided feedback noting that stalls like this happen when processed frames are not freed. Perhaps a review of the life cycle of video frames, from inception to their final consumption, may provide some hints. Would selecting profiles that lead to larger pool sizes resolve your issue, i.e., by hard-coding the profile selection? If so, it may hint at the video not being properly parsed. Cheers,

robashton commented 10 months ago

Stalls like this absolutely do happen when video frames are not freed, but it is my understanding that in this case it is the lookahead plug-in that is adding its own ref to the provided buffers, and it will not release those until it has finished with the frame. (In the case of lookahead depth = 10, that would imply that it will always have a ref on frames from N to N-9, where N is the output of the lookahead plug-in.)
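The understanding described above can be written down as a sliding window that keeps a reference on the most recent `depth` frames and only lets the oldest one go when a new frame arrives. This is a sketch of that mental model, not of the actual lookahead plug-in: the `toy_lookahead` type and `toy_la_push` are hypothetical, and the real plug-in may release frames on a different schedule.

```c
#include <stddef.h>

#define TOY_MAX_DEPTH 64

/* Toy lookahead: keeps a reference on the most recent `depth` frames. */
typedef struct {
    int    held[TOY_MAX_DEPTH]; /* ring of frame ids currently referenced */
    size_t head;                /* index of the oldest held frame */
    size_t count;               /* number of frames currently held */
    size_t depth;               /* lookahead depth, 1..TOY_MAX_DEPTH */
} toy_lookahead;

/* Feed one frame (ids are assumed non-negative). Returns the id of the
 * frame released downstream, or -1 while the window is still filling. */
int toy_la_push(toy_lookahead *la, int frame_id)
{
    int released = -1;
    if (la->count == la->depth) {
        /* Window full: let the oldest frame go before taking a new one. */
        released = la->held[la->head];
        la->head = (la->head + 1) % la->depth;
        la->count--;
    }
    la->held[(la->head + la->count) % la->depth] = frame_id;
    la->count++;
    return released;
}
```

With a depth of 10, this model returns nothing for the first 10 frames and releases frame 0 only when the eleventh is pushed. An upstream pool of exactly 10 buffers would therefore be exhausted before the first release, which is the situation the pool extension is meant to avoid.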

If I am wrong about that then that would be great, but from dumping out the ref counts before/after providing data to the LA plug-in, in both my own code and the transcode demo code, this seems accurate. Seeing as I'm not getting any frames out of the LA plug-in, because in this exact scenario the decoder stalls before I get a chance to, there is nothing I can do about releasing these frames myself.

My current minimum test-case doesn't even rely on the video being parsed to work out what profiles/etc we are dealing with (Although I have confidence in the parsing code, it is well battle-tested).

I am feeding frames from a test card into an x264 encoder with Baseline 3.0 and IDR frames only (min/max IDR interval = 1), feeding the frames out of that directly into the Xilinx decoder one at a time, and configuring the Xilinx decoder with those very same parameters without even looking at the bitstream!

I am certain I am doing something wrong, but it isn't one of those things (unless I'm wrong about the interactions outlined above)

NastoohX commented 10 months ago

Hi, As we do not have an actionable item to collaborate on, I am going to close this ticket. However, if you come across a particular function call or a code segment that you'll need further clarification on, do not hesitate to open a new ticket. Cheers,