NVIDIA / VideoProcessingFramework

A set of Python bindings to C++ libraries that provide full hardware acceleration for video decoding, encoding, and GPU-accelerated color space and pixel format conversions
Apache License 2.0

Ability to work alongside PyAV? #99

Closed vade closed 2 years ago

vade commented 4 years ago

Describe the bug: Hello, VPF appears to be functioning decently in a simple test case. However, it hides, abstracts, or removes (?) a lot of libAV functionality in Python behind a simple interface.

Is it possible to use PyAV to vend compressed packets of HEVC or AVC data and send them to VPF? This would give the flexibility of having most of libAV at your disposal (audio, re-muxing, timestamps, etc.) while keeping the speed of VPF.

Am I misunderstanding the API or is this possible today with VPF?

Thank you.

riaqn commented 3 years ago

@rarzumanyan Thank you. Yes indeed I can just use ffplay. (surprisingly the format is yuv420p instead of yuv444p.)

ffplay -f rawvideo -pixel_format yuv420p -video_size 7680x3840 -i decoded.dump

I couldn't see any artifacts. Let me find a smaller example and upload it here. Do you want each packet in a separate file, or is a single binary file with concatenated packets fine?
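A quick sanity check for a raw dump like the one played above is to confirm that its size is a whole number of frames. The sketch below assumes 8-bit yuv420p with tightly packed planes (no row padding); the helper names are hypothetical.

```python
from typing import Tuple


def yuv420p_frame_size(width: int, height: int) -> int:
    """Bytes in one 8-bit yuv420p frame: a full-size Y plane plus two
    chroma planes subsampled by 2 in each direction."""
    chroma_w = (width + 1) // 2
    chroma_h = (height + 1) // 2
    return width * height + 2 * chroma_w * chroma_h


def count_whole_frames(dump_size: int, width: int, height: int) -> Tuple[int, int]:
    """Return (whole frames, leftover bytes) for a raw dump of dump_size bytes."""
    return divmod(dump_size, yuv420p_frame_size(width, height))
```

For a 7680x3840 stream each frame is 44,236,800 bytes, so `count_whole_frames(os.path.getsize("decoded.dump"), 7680, 3840)` should report zero leftover bytes if the pixel format guess is right.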

riaqn commented 3 years ago

Hello @rarzumanyan, I got a smaller example. Here is the bitstream file. It's gzipped, so you will need to decompress it first. I ran hm on it and the decoded raw YUV file plays perfectly, but when I run it through NVC the returned Surface is always empty. Can you confirm, please?

out.dump.gz

rarzumanyan commented 3 years ago

Hi @riaqn

This file causes both SampleDecode.py and SampleDemuxDecode.py to segfault when the decoded surface is downloaded to Host memory; something is wrong with the Cr chroma plane.

Judging by the content, I assume this is a fragment of Tears Of Steel. Can you decode the original video file (before you demux it with PyAV) using VPF?

I'll be honest: I don't have the bandwidth to debug a bitstream that was made with the PyAV demuxer from an experimental branch.

riaqn commented 3 years ago

> This file causes both SampleDecode.py and SampleDemuxDecode.py to segfault when the decoded surface is downloaded to Host memory; something is wrong with the Cr chroma plane.

That's actually different from the error I have (Surface always empty). Just to be sure: the file I uploaded was obtained by demuxing a video file with PyAV, filtering it with hevc_mp4toannexb, and then passing it through mux_one into an output container (just like your example in https://github.com/NVIDIA/VideoProcessingFramework/issues/99#issuecomment-744408969).

> Judging by the content, I assume this is a fragment of Tears Of Steel. Can you decode the original video file (before you demux it with PyAV) using VPF?

Do you mean passing the mp4 file directly to PyNvDecoder and using its built-in demuxer? I'm quite confident that will just work, but I can try.

> I'll be honest: I don't have the bandwidth to debug a bitstream that was made with the PyAV demuxer from an experimental branch.

Sure, I understand. In the worst case I will just use VPF's demuxer.
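For readers unfamiliar with the hevc_mp4toannexb step mentioned above: its core job is to turn the length-prefixed NAL units stored in MP4 into the start-code-delimited Annex B stream that hardware decoders expect. Below is a minimal sketch of that payload rewrite only. It assumes a 4-byte length field (the real size is signalled in the container's extradata), and it does not insert parameter sets the way the real filter can. The function name is hypothetical and is not part of PyAV, FFmpeg, or VPF.

```python
START_CODE = b"\x00\x00\x00\x01"


def length_prefixed_to_annexb(payload: bytes, length_size: int = 4) -> bytes:
    """Replace each big-endian NAL length prefix with an Annex B start code.

    Assumption: every NAL unit is preceded by a `length_size`-byte length,
    as is typical for H.264/HEVC samples stored in MP4.
    """
    out = bytearray()
    pos = 0
    while pos < len(payload):
        if pos + length_size > len(payload):
            raise ValueError("truncated NAL length field")
        nal_len = int.from_bytes(payload[pos:pos + length_size], "big")
        pos += length_size
        if pos + nal_len > len(payload):
            raise ValueError("NAL unit overruns the packet")
        out += START_CODE + payload[pos:pos + nal_len]
        pos += nal_len
    return bytes(out)
```

In practice the FFmpeg filter is the safer choice; this sketch is only meant to show what changes in the bytes.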

riaqn commented 3 years ago

@rarzumanyan Another update: if I run the following command

ffmpeg -i Tears_400_x265.mp4 -codec copy -bsf:v hevc_mp4toannexb tears.hevc

the obtained tears.hevc is identical (same hash) to the out.dump I uploaded. Therefore I think the file is probably correct?
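The hash comparison described here can be reproduced with a few lines of standard-library Python. This is only a sketch; the helper names are hypothetical, and SHA-256 is an arbitrary choice since the thread does not say which hash was used.

```python
import hashlib


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large bitstreams need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """True when both files hash to the same SHA-256 digest."""
    return file_sha256(path_a) == file_sha256(path_b)
```

For example, `same_content("tears.hevc", "out.dump")` would return True if the two files are byte-identical.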

Another observation: in your example (which I followed), the packet has to go through mux_one. I find that all packets stay the same size except the first packet (which is a key frame), which grows by a few dozen bytes. Is this expected? Or is some unwanted header info being added to the beginning of the file?
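One way to investigate the extra bytes on the first packet is to list the NAL unit types it contains. A plausible explanation, which is an assumption and is not confirmed anywhere in this thread, is that parameter sets (VPS, SPS, PPS) are prepended to the first key frame. The sketch below scans an Annex B buffer for start codes and reads the 6-bit HEVC `nal_unit_type` from the first header byte; the helper names are hypothetical.

```python
# A few HEVC nal_unit_type values from the H.265 specification.
HEVC_NAL_NAMES = {
    19: "IDR_W_RADL", 20: "IDR_N_LP", 21: "CRA_NUT",
    32: "VPS", 33: "SPS", 34: "PPS", 35: "AUD",
    39: "PREFIX_SEI", 40: "SUFFIX_SEI",
}


def hevc_nal_types(data: bytes):
    """Return the nal_unit_type of every NAL unit in an Annex B buffer."""
    types = []
    i = data.find(b"\x00\x00\x01")  # also matches inside 4-byte start codes
    while i != -1:
        header = i + 3
        if header < len(data):
            # nal_unit_type is bits 1..6 of the first header byte.
            types.append((data[header] >> 1) & 0x3F)
        i = data.find(b"\x00\x00\x01", header)
    return types


def describe(types):
    """Map numeric NAL types to short names where known."""
    return [HEVC_NAL_NAMES.get(t, "type %d" % t) for t in types]
```

Running `describe(hevc_nal_types(first_packet_bytes))` before and after mux_one would show whether VPS/SPS/PPS units account for the size difference.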

riaqn commented 3 years ago

@rarzumanyan A small update: SampleDemuxDecode.py works well on the original mp4 file, so I modified the code to write the packets to a file. Note that this file is produced by PyFFmpegDemuxer. It turns out the dumped file is the same (checked by hash) as the file generated by PyAV, and the same as the file generated by the ffmpeg CLI.

So I guess it's not the demuxer, but my decoder that's wrong. I will check. EDIT: OK, I finally worked it out... So yuv420 is actually different from yuv420p... The latter is also called nv12... My pixel format translation from PyAV to nvc was wrong... Guess I paid my price for video codec intro. hehe.
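For reference, in FFmpeg's pixel-format naming yuv420p and nv12 are two distinct 8-bit 4:2:0 layouts: yuv420p is planar (separate Y, U, and V planes), while nv12 is semi-planar (a Y plane followed by one plane of interleaved U and V samples). Mixing them up produces the kind of chroma problems discussed in this thread. The sketch below converts one tightly packed yuv420p frame to NV12 using only the standard library. It assumes even dimensions and no row padding, and the function name is hypothetical.

```python
def yuv420p_to_nv12(frame: bytes, width: int, height: int) -> bytes:
    """Repack a planar yuv420p frame (Y, U, V) as NV12 (Y, interleaved UV)."""
    if width % 2 or height % 2:
        raise ValueError("this sketch assumes even width and height")
    y_size = width * height
    c_size = y_size // 4  # each chroma plane is quarter-size
    if len(frame) != y_size + 2 * c_size:
        raise ValueError("unexpected frame size for yuv420p")
    y = frame[:y_size]
    u = frame[y_size:y_size + c_size]
    v = frame[y_size + c_size:]
    uv = bytearray(2 * c_size)
    uv[0::2] = u  # U samples go to even offsets
    uv[1::2] = v  # V samples go to odd offsets
    return y + bytes(uv)
```

Both layouts have the same total size, which is why a wrong format tag can go unnoticed until the colors look wrong.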

rarzumanyan commented 2 years ago

No response from PyAV for a long time, closing this epic thread.

InkosiZhong commented 1 year ago

@lferraz Hi, I'm wondering if you have solved the wrong flag=K_ problem when muxing with PyAV. I'm trying to use VPF to seek to a specified frame in a video that was encoded by VPF and muxed with PyAV. Here is a demo of my code.

# encode and mux
enc_array = bytearray(enc_frame) # this enc_frame is generated by EncodeSingleSurface
packet = av.packet.Packet(enc_array)
packet.stream = self.stream
packet.dts = xxx
packet.pts = xxx
container.mux_one(packet)

# decode
seek_ctx = nvc.SeekContext(seek_frame=15, seek_criteria=nvc.SeekCriteria.BY_NUMBER)
raw_surface = nv_dec.DecodeSingleSurface(seek_ctx)

# I got error
Decode Error occurred for picture 0
HW decoder faced error. Re-create instance.
HW decoder reset time: 18 milliseconds 
Traceback (most recent call last)
    raw_surface = nv_dec.DecodeSingleSurface(seek_ctx)
PyNvCodec.HwResetException: HW reset 

I used ffprobe to compare this video with a video encoded directly with PyAV:

ffprobe -show_packets test.mp4

I find that the only difference is the flag=K_ problem.
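The comparison described here can be scripted so that the key-frame flags of two files are diffed directly. The sketch below is an illustration, not a fix: it shells out to ffprobe with JSON output and reports which video packets carry a flags string starting with `K`. The helper names are hypothetical, and it assumes ffprobe is on the PATH.

```python
import json
import subprocess


def probe_packets(path):
    """Return ffprobe's JSON description of the first video stream's packets."""
    cmd = ["ffprobe", "-v", "error", "-select_streams", "v:0",
           "-show_entries", "packet=pts,flags", "-of", "json", path]
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def keyframe_indices(ffprobe_json):
    """Indices of packets whose flags string marks them as key frames."""
    packets = json.loads(ffprobe_json).get("packets", [])
    return [i for i, p in enumerate(packets)
            if p.get("flags", "").startswith("K")]
```

Comparing `keyframe_indices(probe_packets("test.mp4"))` for the two files shows exactly which packets differ in their key-frame flag.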

By the way, if I don't mux with PyAV, I get a different error; see #422.

  1. I was wondering if there is a way to fix the flag problem.
  2. If not, is there a way to make VPF seek correctly to a frame in a video that it encoded itself?

lferraz commented 1 year ago

@InkosiZhong, as far as I remember, in the end I discarded PyAV because of all these issues... sorry.