Closed chacha21 closed 1 month ago
BTW, where can I download a recent vplswref64.dll? I can't tell where mine comes from, but it is not built by the current libvpl git project. [edit] Found here: https://www.intel.com/content/www/us/en/developer/articles/tool/oneapi-standalone-components.html#inpage-nav-8-9
It does not solve the bug.
This vplswref64.dll was a CPU-based runtime component, and we don't support it anymore. We only support the GPU runtime, which comes with the graphics driver. For that, please check the hello-decode sample.
I don't have a development machine with Intel GFX, so I am bound to use "vplswref64.dll".
I don't think it is relevant to this thread. I mentioned it because the sample code I provided expects the VPL runtime DLLs to be deployed on the host machine; they are not part of the project. So for anyone who wants to test, this reference was needed at least to be able to run the code properly.
The original intention of the CPU runtime was to provide "reference" functionality, as you mentioned, but we discontinued it. The CPU runtime might have had an issue dealing with the input you feed it. So I recommend that you get an Intel device and try.
I'll try to find a host machine with compatible Intel HD graphics. But I'm curious to know if you can test with HW acceleration and see the bug yourself.
What is your goal? Do you want to decode frame by frame? Or do you want to decode a video stream, and this I-frame test is just an experiment with VPL?
I am evaluating VPL as an alternative backend engine for encoding and decoding sequences. I already use Microsoft Media Foundation and CUDA NVEnc, and had an IMSDK implementation in the past.
My use case is:
- encoding: either a stream or a sequence of images
- decoding: either a stream or a sequence of images to be randomly accessed frame by frame (not a simple incremental playback).
For decoding, I am pretty familiar with NALUs, I can parse raw samples if needed, and I am already able to identify I, P and B frames in order to submit enough data for each frame, so I really focus on VPL as the last step of decoding.
Currently I am experimenting with VPL, but since I do not have compatible hardware, I thought that I could rely on the SW reference implementation to get the best tests, even without full performance.
Got it. Thank you for the detailed information. I will try your code quickly and see whether anything is missing.
Do you see "MFX_ERR_MORE_DATA" from this part?
mfxBitstream bs = {0};
bs.Data = rawFileContent.data();
bs.MaxLength = static_cast
Then it won't work, because you are feeding an MP4 stream, not a video elementary stream. You probably know that MP4 is a container, and you need to extract the raw video data from each packet. VPL does not support any type of container.
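A note for readers on the container point: MP4 files typically store H.264 NAL units with a length prefix, whereas an elementary stream uses Annex B start codes. The thread does not say which form the Media Foundation samples are in, so the following is only a minimal sketch under the assumption of 4-byte big-endian length prefixes (the actual prefix size is declared in the file's avcC box). `LengthPrefixedToAnnexB` is a hypothetical helper, not a VPL or Media Foundation function.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of VPL): rewrites one sample made of
// length-prefixed NAL units into Annex B form (00 00 00 01 start codes).
// Assumption: every NAL unit is preceded by a 4-byte big-endian length.
std::vector<std::uint8_t> LengthPrefixedToAnnexB(const std::vector<std::uint8_t>& sample)
{
    static const std::uint8_t kStartCode[4] = {0, 0, 0, 1};
    std::vector<std::uint8_t> out;
    std::size_t pos = 0;
    while (pos < sample.size()) {
        if (sample.size() - pos < 4)
            throw std::runtime_error("truncated NAL length prefix");
        // Read the big-endian NAL unit size.
        const std::size_t nalSize =
            (static_cast<std::size_t>(sample[pos]) << 24) |
            (static_cast<std::size_t>(sample[pos + 1]) << 16) |
            (static_cast<std::size_t>(sample[pos + 2]) << 8) |
            static_cast<std::size_t>(sample[pos + 3]);
        pos += 4;
        if (nalSize > sample.size() - pos)
            throw std::runtime_error("NAL unit exceeds sample size");
        // Emit a start code followed by the NAL unit payload, unchanged.
        const auto first = sample.begin() + static_cast<std::ptrdiff_t>(pos);
        const auto last = first + static_cast<std::ptrdiff_t>(nalSize);
        out.insert(out.end(), kStartCode, kStartCode + 4);
        out.insert(out.end(), first, last);
        pos += nalSize;
    }
    return out;
}
```

If the samples already begin with start codes, no conversion is needed and this helper should not be applied.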
Do you see "MFX_ERR_MORE_DATA" from this part?
No. I can't send a console log right now (AFK), but the MFX_ERR_MORE_DATA that bothers me is this one:
if (decStatus == MFX_ERR_MORE_DATA)
printf("unexpected MFX_ERR_MORE_DATA, this is a mfx wrong behaviour\r\n");
To be comprehensive, please note that
And finally :
Can you share the code you modified? It fails where I pointed out and can't reach that point. It looks like you commented out some parts.
Can you share the code you modified?
I did not modify the code attached to the first post of this issue. The code shows different strategies to initialize things, and some errors are normal. I just put assert() for critical failures.
My console output is shown below:
Please check this code. It's dirty, but I modified the code to load the GPU runtime and save the I-frame output, and I added some comments. Please refer to "hello-decode" or "sample_decode" for a general implementation.
Ok, I see what you did. I will test on Monday, and if it works I will run more extensive tests and compare the results with what the doc claims or misleadingly suggests. Only then will I come back with feedback.
Ok, I have tested and there are many problems:
- DecodeFrameAsync() will return MFX_ERR_ABORTED. According to the VPL doc, sending a null bitstream is supposed to be done at the end of stream, not end of frame. I think that's why
- MFXVideoDecode_Reset() between frames would be inefficient but could help. Actually it just does not work (it always returns an unexpected and unexplained error)
That's why I asked about the goal of your final app, not your experiment. I did show you how you can decode I frame only. If you want to decode full frames, please refer to hello-decode or sample_decode.
I did show you how you can decode I frame only.
My code can handle non-I frames. Thanks to the Media Foundation part, I can read and accumulate samples starting from the previous I frame up to the targeted P frame, and send all the bytes at once to VPL through the bitstream. Then I expect VPL to output a frame since it must have enough data (but unfortunately, certainly because of internal buffering, MFX_ERR_MORE_DATA is returned).
The problem is not to read non-I frames, it is to read two different frames (at random positions) : "flushing" seems impossible since the "null bs" trick is not usable.
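The accumulation strategy described above (start at the previous I frame, stop at the targeted frame) can be sketched as follows. This is only an illustration: `SamplesToFeed` is a hypothetical helper, the key-frame flags are assumed to come from the demuxer, and the sketch assumes decode order equals presentation order (no B-frame reordering) and that the first sample is a key frame.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: returns the inclusive range [first, last] of sample
// indices to submit so that sample `target` can be decoded.
// Assumptions: isKeyFrame[0] is true, and samples are stored in decode order
// with no reordering, so everything needed lies between the nearest
// preceding key frame and the target itself.
std::pair<std::size_t, std::size_t> SamplesToFeed(const std::vector<bool>& isKeyFrame,
                                                  std::size_t target)
{
    assert(target < isKeyFrame.size());
    std::size_t first = target;
    // Walk backwards until a key frame (or the start of the sequence) is hit.
    while (first > 0 && !isKeyFrame[first])
        --first;
    return {first, target};
}
```

For a sequence flagged as I P P I P, asking for index 2 yields the range 0..2, and asking for index 4 yields 3..4.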
Have you read hello-decode sample? "null bs" is not a trick, it's needed when you drain remained decoded frames. Once you're done reading the input stream, you should set bs to null and ask VPL to decode everything left in its buffer and return it. In your case, when it returns MFX_ERR_MORE_DATA, please call MFXVideoDECODE_DecodeFrameAsync() with a null bs until it returns MFX_ERR_MORE_DATA again. Let's say you feed "IPPPP" and want to get the third and last P frames. Then you call MFXVideoDECODE_DecodeFrameAsync() with the input stream (IPPPP). If VPL returns MFX_ERR_MORE_DATA, call MFXVideoDECODE_DecodeFrameAsync() with bs=NULL until it returns MFX_ERR_MORE_DATA again. I expect it to give you I, P, P, P, P.
Have you read hello-decode sample?
Sure, and I learnt nothing new
"null bs" is not a trick, it's needed when you drain remained decoded frames.
Apparently, you can only use it once at the end of the stream. Once it has been done to drain and get a frame (from I, or IP, or IPP, or IPPP...), subsequent calls to MFXVideoDECODE_DecodeFrameAsync() will return MFX_ERR_ABORTED and you cannot send a new bitstream to decode a new frame (I or IP or IPP...) at a totally different position.
That's right. Once you get MFX_ERR_MORE_DATA with a null bs, it means there is no more data left, and the next call will return a real error. So, your problem is.. you can't do this continuously but just once .. because decode process will be done once bs=null is given.
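The call sequence described in the last few messages can be pictured with a toy state machine. To be clear, this is not VPL code: `ToyDecoder` and `ToyStatus` are hypothetical names, and the model deliberately simplifies by always buffering regular input. It only mirrors the behaviour reported in this thread: input returns "more data", a null input drains one buffered frame per call, the drain ends with "more data", and every call after that is rejected.

```cpp
#include <cassert>
#include <cstddef>

// Toy stand-ins for the statuses discussed in the thread (not VPL types).
enum class ToyStatus { Ok, MoreData, Aborted };

// Toy model of "drain once, then aborted". Not a VPL implementation.
class ToyDecoder {
public:
    // framesInInput: number of frames in the submitted data, or nullptr to drain.
    ToyStatus DecodeFrame(const std::size_t* framesInInput)
    {
        if (finished_)
            return ToyStatus::Aborted;   // drain already completed: no further input accepted
        if (framesInInput != nullptr) {
            buffered_ += *framesInInput; // simplification: input is only buffered
            return ToyStatus::MoreData;
        }
        if (buffered_ > 0) {
            --buffered_;                 // drain: hand back one buffered frame
            return ToyStatus::Ok;
        }
        finished_ = true;                // nothing left: the drain is over
        return ToyStatus::MoreData;
    }

private:
    std::size_t buffered_ = 0;
    bool finished_ = false;
};
```

Feeding five frames ("IPPPP") and then draining gives five successful calls, one "more data", and "aborted" from then on, which is the "just once" limitation discussed here.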
So, your problem is.. you can't do this continuously but just once .. because decode process will be done once bs=null is given.
Right. With the "bs=null" drain, you fixed the initial problem of this issue thread, that was "can't get a frame at all". But now that I can get a frame, I see that the next step is "I can't get a second frame", in a scenario where I don't read a stream sequentially, but let the user choose a random position in the sequence.
OK, I don't really have an optimal solution right now, but why don't you try giving VPL enough buffered data that it can return the Nth frame and avoid MFX_ERR_MORE_DATA? Meanwhile, I will check more.
Please check this as well. https://intel.github.io/libvpl/latest/programming_guide/VPL_prg_decoding.html#bitstream-repositioning
Interesting, so MFXVideoDecode_Reset() should "officially" be the answer for stream repositioning (I suspected it would be inefficient, but that might be wrong).
However, I mentioned from the beginning of this thread that in my sample code, MFXVideoDecode_Reset() always returns an error, even with decodeParams manually filled with proper values.
I guess I'll have to investigate a little more (perhaps with a debugger to step into the VPL source code) to get more clues about that.
@chacha21 Could you close this issue and open new one if you have any issue with MFXVideoDecode_Reset()?
I might or might not open a new issue, depending on the following considerations (not clear from the docs):
1. MFXVideoDecode_Reset() is the correct way to decode frames at random positions. In that case, I do have a problem with MFXVideoDecode_Reset() and can open a new issue.
2. MFXVideoDecode_Reset() is expected to fail since the bs is considered aborted, and thus MFXVideoDecode_Reset() can't help. In that case I cannot claim I observe a bug (even if I can't make it work, it wouldn't be a MFXVideoDecode_Reset() problem).
3. MFXVideoDecode_Reset() can be used to flush before repositioning (very unlikely: the doc does not really say that). In that case we are still bringing information to the current issue.
Closed due to no further issues or blocking feedback from submitter.
Aaand, there never was a clear answer from the VPL team. See the last message above : those are pending questions.
@chacha21
We did not see a specific concern to address related to the original question.
Can you please clarify what you are looking for specifically?
I am still looking for a way to perform stream repositioning.
Hi @chacha21 We consulted with the VPL GPU Runtime team on this, as the behavior comes from their source code.
They confirmed MFXVideoDECODE_Reset() failing is expected behavior in the following sequence:
And the flush step is not needed if you are jumping to a different location in the bitstream using MFXVideoDECODE_Reset()
Based on your comments in this thread their feedback is that if you are looking to jump to different locations: "this should be done using Reset() and then feeding enough data so that new sequence header is found and decoding may proceed."
This aligns with your scenario 2 above.
If you have more questions regarding the documentation and/or usage scenario of MFXVideoDECODE_Reset(), a more expedient route would be to file an issue directly with the VPL GPU Runtime team so they can respond directly: https://github.com/intel/vpl-gpu-rt/issues
This does not appear to be a VPL issue, and further questions should be directed to the VPL-GPU-RT team, per the comments above. Closing this issue.
VPL 2.10.1 (not a regression, it did not work with previous version either)
I want to use VPL to decode an H264 sequence embedded in an MP4 container (link below to data and sample code).
- I use Microsoft Media Foundation to query the raw encoded samples from the file.
- I submit the samples to a properly initialized mfxSession.
- For the very first sample (which is a valid I frame), MFX_ERR_MORE_DATA is issued.
By pushing more and more samples, I can finally get some decoded data, but this is not expected behaviour
I want the decoding session to provide the decoded data synchronously when all the required data for an I frame has been submitted.
TestMFTVPL.zip
Is this a VPL design concern ?