RomanArzumanyan / VALI

Video processing in Python
Apache License 2.0

Decode Error occurred for picture & HW decoder faced error. Re-create instance #31

Closed leyankon closed 3 months ago

leyankon commented 3 months ago

I am using the code from branch v2.1.0 and found a problem:

Decode Error occurred for picture 76303
HW decoder faced error. Re-create instance
HW decoder faced error. Re-create instance
HW decoder faced error. Re-create instance
HW decoder faced error. Re-create instance
HW decoder faced error. Re-create instance
HW decoder faced error. Re-create instance

This problem also appeared in earlier versions, where the fix was to catch the exception (see https://github.com/NVIDIA/VideoProcessingFramework/issues/47 and https://github.com/NVIDIA/VideoProcessingFramework/issues/353):

while True:  
    try: 
        rawSurface = nvD.DecodeSingleSurface() 
        if rawSurface.Empty(): 
            print('No more video frames') 
            break 
    except nvc.HwResetException: 
        print('Continue after HW decoder was reset') 
        continue 

But in branch v2.1.0, HwResetException is never thrown. Why?

RomanArzumanyan commented 3 months ago

Hi @leyankon

What’s the exact problem you see with the latest VALI version? In your original message you refer to v2.1.0 twice.

leyankon commented 3 months ago

When decoding an H.265 RTSP stream, once parsing of a frame fails, the message "Decode Error occurred for picture 76303 HW decoder faced error. Re-create instance" keeps appearing continuously. Despite this error message, HwResetException is not thrown, and the RTSP connection is interrupted.

RomanArzumanyan commented 3 months ago

The latest VALI returns not only the decoded surface but also the status of the decode task:

surface, details = nvDec.DecodeSingleSurface()

What is the value of details the first time you see the error message?

leyankon commented 3 months ago

This is my code:

import sys
import os

import cv2

if os.name == "nt":
    # Add CUDA_PATH env variable
    cuda_path = os.environ["CUDA_PATH"]
    if cuda_path:
        os.add_dll_directory(cuda_path)
    else:
        print("CUDA_PATH environment variable is not set.", file=sys.stderr)
        print("Can't set CUDA DLLs search path.", file=sys.stderr)
        exit(1)

    # Add PATH as well for minor CUDA releases
    sys_path = os.environ["PATH"]
    if sys_path:
        paths = sys_path.split(";")
        for path in paths:
            if os.path.isdir(path):
                os.add_dll_directory(path)
    else:
        print("PATH environment variable is not set.", file=sys.stderr)
        exit(1)

import pycuda.driver as cuda
import PyNvCodec as nvc
import numpy as np

def decode(gpuID, encFilePath, decFilePath):
    cuda.init()
    cuda_ctx = cuda.Device(gpuID).retain_primary_context()
    cuda_ctx.push()
    cuda_str = cuda.Stream()
    cuda_ctx.pop()

    nvDmx = nvc.PyFFmpegDemuxer(encFilePath, {'rtsp_transport': 'tcp'})
    nvDec = nvc.PyNvDecoder(
        nvDmx.Width(),
        nvDmx.Height(),
        nvDmx.Format(),
        nvDmx.Codec(),
        cuda_ctx.handle,
        cuda_str.handle,
    )
    nvCvt = nvc.PySurfaceConverter(
        nvDmx.Width(),
        nvDmx.Height(),
        nvDmx.Format(),
        nvc.PixelFormat.BGR,
        cuda_ctx.handle,
        cuda_str.handle,
    )
    nvDwn = nvc.PySurfaceDownloader(
        nvDmx.Width(), nvDmx.Height(), nvCvt.Format(), cuda_ctx.handle, cuda_str.handle
    )

    packet = np.ndarray(shape=(0), dtype=np.uint8)
    # frameSize = int(nvDmx.Width() * nvDmx.Height() * 3 / 2)
    rawFrame = np.ndarray((nvDmx.Height(), nvDmx.Width(), 3), dtype=np.uint8)
    pdata_in, pdata_out = nvc.PacketData(), nvc.PacketData()

    # Determine colorspace conversion parameters.
    # Some video streams don't specify these parameters, so fall back to
    # BT.709 and MPEG range when they are unspecified.
    cspace, crange = nvDmx.ColorSpace(), nvDmx.ColorRange()
    if nvc.ColorSpace.UNSPEC == cspace:
        cspace = nvc.ColorSpace.BT_709
    if nvc.ColorRange.UDEF == crange:
        crange = nvc.ColorRange.MPEG
    cc_ctx = nvc.ColorspaceConversionContext(cspace, crange)
    print("Color space: ", str(cspace))
    print("Color range: ", str(crange))
    while True:
        try:
            # Demuxer has sync design, it returns packet every time it's called.
            # If demuxer can't return packet it usually means EOF.
            if not nvDmx.DemuxSinglePacket(packet):
                break

            # Get last packet data to obtain frame timestamp
            nvDmx.LastPacketData(pdata_in)

            # Decoder is async by design.
            # As it consumes packets from demuxer one at a time it may not return
            # decoded surface every time the decoding function is called.
            surface_nv12, details = nvDec.DecodeSurfaceFromPacket(pdata_in, packet, pdata_out)
            print("details", details)
            print("surface_nv12", surface_nv12)
            if not surface_nv12.Empty():
                surface_yuv420, _ = nvCvt.Execute(surface_nv12, cc_ctx)
                if surface_yuv420.Empty():
                    break
                if not nvDwn.DownloadSingleSurface(surface_yuv420, rawFrame):
                    break
                # cv2.imshow("", rawFrame)
                # if cv2.waitKey(1) & 0xFF == ord('q'):
                #     break
        except nvc.HwResetException:
            print("HW----------")

if __name__ == "__main__":

    print("This sample decodes input video to raw YUV420 file on given GPU.")
    print("Usage: SampleDecode.py $gpu_id $input_file $output_file.")

    # if len(sys.argv) < 4:
    #     print("Provide gpu ID, path to input and output files")
    #     exit(1)

    gpuID = 0
    # encFilePath = sys.argv[2]
    # decFilePath = sys.argv[3]

    decode(gpuID, "rtsp://192.168.1.165:554/rtp/34020000001320000251_34020000001320000001", "./hhh.mp4")

My environment is Ubuntu 22.04 with CUDA 11.8.

This is the output printed to my console: err.txt

These errors will occur during decoding, but they should not cause all subsequent decoding to fail.

Decode Error occurred for picture 6520
[rtsp @ 0x273df40] Multi-layer HEVC coding is not implemented. Update your FFmpeg version to the newest one from Git. If the problem still occurs, it means that your file has a feature which has not been implemented.
HW decoder faced error. Re-create instance.

RomanArzumanyan commented 3 months ago

Hi @leyankon

You ignore the decoder's output.

surface_nv12, details = nvDec.DecodeSurfaceFromPacket(pdata_in, packet, pdata_out)
if not surface_nv12.Empty():
    surface_yuv420, _ = nvCvt.Execute(surface_nv12, cc_ctx)

# what if surface_nv12 is empty and / or details has information about the decoding failure?

The decoder actually signals that it has failed and that the surface is empty (taken from err.txt):

#
# Things go well
#
details TaskExecInfo.SUCCESS
surface_nv12 
Width:            1920
Height:           1080
Format:           NV12
Pitch:            2048
Elem size(bytes): 1
Plane 0
  Owns mem:  1
  Width:     1920
  Height:    1620
  Pitch:     2048
  Elem size: 1
  Cuda ctx:  0x1e7fd80
  CUDA ptr:  124052874199040

#
# OK, this one is weird. I acknowledge this.
# I can't tell why status is SUCCESS but surface is empty.
#
details TaskExecInfo.SUCCESS
surface_nv12 
Width:            0
Height:           0
Format:           NV12
Pitch:            0
Elem size(bytes): 1
Plane 0
  Owns mem:  0
  Width:     0
  Height:    0
  Pitch:     0
  Elem size: 0
  Cuda ctx:  0
  CUDA ptr:  0

#
# Decoder failed to do its job
#
details TaskExecInfo.FAIL
surface_nv12 
Width:            0
Height:           0
Format:           NV12
Pitch:            0
Elem size(bytes): 1
Plane 0
  Owns mem:  0
  Width:     0
  Height:    0
  Pitch:     0
  Elem size: 0
  Cuda ctx:  0
  CUDA ptr:  0

Long story short: the behavior of the PyNvDecoder class in VALI has changed since the latest public VPF release. This was done to make PyNvDecoder more informative. It may now return one of several decoding statuses, such as SUCCESS, FAIL, or EOF, without relying on stderr (which may be unavailable in many production cases).

This new behavior is illustrated in VALI documentation. Please follow it.
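
To make the three cases from err.txt concrete, here is a minimal sketch of checking the status before touching the surface. It does not import PyNvCodec: the TaskExecInfo enum and FakeSurface class below are hypothetical stand-ins that only mimic the two statuses and the Empty() check quoted in this thread, and classify() is an illustrative helper, not part of VALI. Treating "SUCCESS with an empty surface" as "no frame yet, feed the next packet" is an assumption.

```python
from enum import Enum, auto
from typing import NamedTuple


class TaskExecInfo(Enum):
    """Stand-in for the decoder status enum (only the members seen in err.txt)."""
    SUCCESS = auto()
    FAIL = auto()


class FakeSurface(NamedTuple):
    """Hypothetical stand-in for a decoded surface."""
    width: int
    height: int

    def Empty(self) -> bool:
        return self.width == 0 or self.height == 0


def classify(surface: FakeSurface, details: TaskExecInfo) -> str:
    """Map one decode result to an action, checking the status first."""
    if details == TaskExecInfo.FAIL:
        return "reset"    # decoder reported a failure: re-create it
    if surface.Empty():
        return "skip"     # assumption: no frame this call, feed the next packet
    return "process"      # valid frame: convert / download it


print(classify(FakeSurface(1920, 1080), TaskExecInfo.SUCCESS))  # process
print(classify(FakeSurface(0, 0), TaskExecInfo.SUCCESS))        # skip
print(classify(FakeSurface(0, 0), TaskExecInfo.FAIL))           # reset
```

In a real loop the same order of checks would apply to the pair returned by DecodeSurfaceFromPacket.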

leyankon commented 3 months ago

@RomanArzumanyan Thanks. I added this code and now it resets the decoder when an error occurs.

surface_nv12, details = nvDec.DecodeSurfaceFromPacket(pdata_in, packet, pdata_out)
if details == nvc.TaskExecInfo.FAIL:
    nvDec = nvc.PyNvDecoder(
        nvDmx.Width(),
        nvDmx.Height(),
        nvDmx.Format(),
        nvDmx.Codec(),
        cuda_ctx.handle,
        cuda_str.handle,
    )
    continue

RomanArzumanyan commented 3 months ago

@leyankon

Good news, I'm glad that it helped. Closing issue as resolved.
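
For anyone landing here later, the reset-on-FAIL approach from this thread can be sketched in a self-contained form. Everything below is a stand-in: FakeDecoder, decode_all and the make_decoder factory are hypothetical names, and the fake decoder simply assumes that once it fails it keeps failing until re-created, which matches the repeated error message reported above. The cap on consecutive resets is an addition not discussed in the thread; it only keeps the loop from spinning forever on a stream that never recovers.

```python
from enum import Enum, auto


class TaskExecInfo(Enum):
    """Stand-in for the decoder status enum."""
    SUCCESS = auto()
    FAIL = auto()


class FakeDecoder:
    """Hypothetical decoder: fails on a 'bad' packet and stays broken afterwards."""

    def __init__(self):
        self.broken = False

    def decode(self, packet):
        if self.broken or packet == "bad":
            self.broken = True
            return None, TaskExecInfo.FAIL
        return f"frame({packet})", TaskExecInfo.SUCCESS


def decode_all(packets, make_decoder, max_consecutive_resets=3):
    """Decode packets, re-creating the decoder whenever it reports FAIL."""
    decoder = make_decoder()
    frames, resets = [], 0
    for packet in packets:
        surface, details = decoder.decode(packet)
        if details == TaskExecInfo.FAIL:
            resets += 1
            if resets > max_consecutive_resets:
                raise RuntimeError("decoder keeps failing, giving up")
            decoder = make_decoder()  # fresh instance, as in the fix above
            continue                  # the failed packet is dropped
        resets = 0                    # a good frame clears the failure streak
        frames.append(surface)
    return frames


print(decode_all(["p0", "bad", "p2", "p3"], FakeDecoder))
# ['frame(p0)', 'frame(p2)', 'frame(p3)']
```

Passing a factory instead of a decoder instance keeps the constructor arguments (width, height, codec, CUDA handles) in one place, so the initial creation and every reset use the same settings.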