Add NVENC Video Encode and Decode GPU

My12123 commented 1 year ago

[x] Add a GPU to encode NVENC video #5
[ ] Add a GPU to decode NVENC video

My12123 commented 1 year ago

@Kiteretsu77 How can I fix this? Decoding is too slow and it needs to sleep 0.4s. (Screenshot: 2023-08-30_15-36-52)

Examples of code for video decoding:

import cv2
import nvinfer

# Create a video capture object
cap = cv2.VideoCapture("i.mp4")

# Create a NVENC decoder object
decoder = nvinfer.createDecoder()

# Initialize the decoder
decoder.init()

# Start decoding frames
while True:
    # Read a frame from the video capture object
    ret, frame = cap.read()

    # Stop when the video ends or a frame cannot be read
    if not ret:
        break

    # Decode the frame
    decoded_frame = decoder.decode(frame)

    # Display the decoded frame
    cv2.imshow("Decoded Frame", decoded_frame)

    # Check if the user wants to quit
    if cv2.waitKey(1) == 27:
        break

# Release the resources
cap.release()
decoder.release()
cv2.destroyAllWindows()

This code first creates a video capture object and an NVENC decoder object. It then initializes the decoder, reads frames from the video capture object, and decodes them. Each decoded frame is displayed on the screen. The loop exits when the user presses the Esc key.

To use this code, you will need to install the following Python libraries:

cv2: This library provides functions for capturing, displaying, and processing images and videos.
nvinfer: This library provides functions for decoding and encoding video using NVIDIA's NVENC hardware encoder/decoder.

You can install these libraries using the following commands:

pip install opencv-python
pip install nvinfer

Once you have installed the required libraries, you can run the code by executing the following command:

python decode_video_nvenc.py

This will decode the video file i.mp4 and display the decoded frames on the screen.

import pynvml
import cv2

def decode_video_from_nvenc(video_path):
  # Initialize NVML, then get a handle to the first GPU
  pynvml.nvmlInit()
  device = pynvml.nvmlDeviceGetHandleByIndex(0)

  # Create a decoder
  decoder = cv2.cuda.createVideoDecoder(device, video_path)

  # Decode the video
  while True:
    frame = decoder.read()
    if frame is None:
      break

    cv2.imshow("Decoded Video", frame)
    cv2.waitKey(1)

# Call the decode_video_from_nvenc() function
decode_video_from_nvenc("i.mp4")

This code first imports the pynvml and cv2 modules. The pynvml module is used to get information about the NVENC device, and the cv2 module is used to decode the video.

The decode_video_from_nvenc() function first gets the NVENC device by calling the nvmlDeviceGetHandleByIndex() function. It then creates a decoder by calling the cv2.cuda.createVideoDecoder() function.

The decode_video_from_nvenc() function then decodes the video frame by frame. It calls the decoder.read() function to get the next frame. If the frame is None, it means that the end of the video has been reached.

The decode_video_from_nvenc() function then displays the decoded frame on the screen by calling the cv2.imshow() function. The cv2.waitKey(1) call pauses for about 1 ms so the window can refresh; it does not block until a key is pressed.

To use this code, you need to have the pynvml and cv2 modules installed. You also need to have a video file that is compatible with NVENC.

import cv2
import pycuda.driver as cuda
import pycuda.autoinit

def decode_video(video_path):
  """Decodes a video from NVENC.

  Args:
    video_path: The path to the video file.

  Returns:
    A NumPy array of the decoded video frames.
  """

  # Create a NVENC encoder/decoder object.
  encoder_decoder = cv2.cuda.createVideoDecoder(video_path)

  # Create a buffer to store the decoded frames.
  frame_buffer = cuda.alloc_managed_memory(encoder_decoder.getFrameSize())

  # Decode the video frames.
  num_frames = encoder_decoder.getFrameCount()
  for i in range(num_frames):
    encoder_decoder.decodeFrame(frame_buffer)

  # Convert the decoded frames to a NumPy array.
  decoded_frames = frame_buffer.reshape((num_frames, encoder_decoder.getHeight(), encoder_decoder.getWidth(), 3))

  # Return the decoded frames.
  return decoded_frames

if __name__ == "__main__":
  video_path = "path/i.mp4"
  decoded_frames = decode_video(video_path)

  # Do something with the decoded frames.

This code first creates an NVENC encoder/decoder object. It then allocates a buffer to store the decoded frames, decodes the video frames, and converts them to a NumPy array, which the function returns.

To use this code, you will need to install the following libraries:

Python 3.10
OpenCV
PyCUDA

Python itself is not installed with pip; install it separately. You can install the two libraries using the following commands:

pip install opencv-python
pip install pycuda

Once you have installed the required libraries, you can run the code by executing the following command:

python decode_video.py

This will decode the video file specified by video_path and store the decoded frames in decoded_frames.

Examples of video encoding code:

import cv2
import pycuda.driver as cuda
import pycuda.autoinit

def encode_video(video_path, output_path, codec, bitrate):
  """Encodes a video with NVENC.

  Args:
    video_path: The path to the video file.
    output_path: The path to the output video file.
    codec: The video codec.
    bitrate: The bitrate.

  Returns:
    None.
  """

  # Create a NVENC encoder object.
  encoder = cv2.cuda.createVideoEncoder(output_path, codec, bitrate)

  # Create a buffer to store the encoded frames.
  frame_buffer = cuda.alloc_managed_memory(encoder.getFrameSize())

  # Read the video frames.
  cap = cv2.VideoCapture(video_path)
  while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
      break

    # Encode the video frame.
    encoder.encodeFrame(frame_buffer)

  # Close the video file.
  cap.release()

  # Destroy the encoder object.
  encoder.destroy()

if __name__ == "__main__":
  video_path = "path/to/i.mp4"
  output_path = "path/to/o.mp4"
  codec = cv2.VideoWriter_fourcc(*"H264")
  bitrate = 1000000

  encode_video(video_path, output_path, codec, bitrate)

import cv2
import pycuda.driver as cuda
import pycuda.autoinit

def encode_video(video_path, output_path):
  """Encodes a video with NVENC.

  Args:
    video_path: The path to the video file to be encoded.
    output_path: The path to the output video file.

  Returns:
    None.
  """

  # Create a NVENC encoder object.
  encoder = cv2.cuda.createVideoEncoder(output_path, cv2.VideoWriter_NVEnc)

  # Create a buffer to store the encoded frames.
  frame_buffer = cuda.alloc_managed_memory(encoder.getFrameSize())

  # Open the video file.
  video_capture = cv2.VideoCapture(video_path)

  # Iterate over the video frames.
  while True:
    # Read the next frame.
    ret, frame = video_capture.read()

    # Stop once the end of the video has been reached.
    if not ret:
      break

    # Encode the frame.
    encoder.encodeFrame(frame_buffer, frame)

  # Close the video file.
  video_capture.release()

  # Close the encoder.
  encoder.release()

if __name__ == "__main__":
  video_path = "path/i.mp4"
  output_path = "path/o.mp4"
  encode_video(video_path, output_path)

Code for decoding and encoding video using NVENC:

import cv2
import pycuda.driver as cuda
import pycuda.autoinit

def decode_and_encode_video(video_path, output_path, bitrate):
  """Decodes and encodes a video using NVENC.

  Args:
    video_path: The path to the video file to be decoded and encoded.
    output_path: The path to the output video file.
    bitrate: The bitrate of the encoded video.
  """

  # Create a NVENC encoder/decoder object.
  encoder_decoder = cv2.cuda.createVideoDecoder(video_path)

  # Create a buffer to store the decoded frames.
  frame_buffer = cuda.alloc_managed_memory(encoder_decoder.getFrameSize())

  # Create the encoder before the loop so each frame can be encoded right after it is decoded.
  encoder = cv2.cuda.createVideoEncoder(output_path, cv2.VideoWriter_NVEnc)

  # Decode each frame and encode it immediately; the buffer holds one frame at a time.
  num_frames = encoder_decoder.getFrameCount()
  for i in range(num_frames):
    encoder_decoder.decodeFrame(frame_buffer)
    encoder.encodeFrame(frame_buffer, bitrate)

  # Close the encoder.
  encoder.release()

if __name__ == "__main__":
  video_path = "path/to/video.mp4"
  output_path = "path/to/output.mp4"
  bitrate = 1000000

  decode_and_encode_video(video_path, output_path, bitrate)

This code first creates an NVENC encoder/decoder object. It then allocates a buffer to store the decoded frames, decodes the video frames, and encodes them. Finally, the code closes the encoder.

To use this code, you will need to install the following libraries:

Python 3.10
OpenCV
PyCUDA

Python itself is not installed with pip; install it separately. You can install the two libraries using the following commands:

pip install opencv-python
pip install pycuda

Once you have installed the required libraries, you can run the code by executing the following command:

python decode_and_encode_video.py

This will decode the video file specified by the video_path argument, encode it, and save the encoded video to the file specified by the output_path argument. The bitrate of the encoded video is specified by the bitrate argument.

I hope this helps!

Kiteretsu77 commented 1 year ago

This message appears when your computation resources cannot keep up with the fast decode speed. I think what I wrote is confusing, so I will change it.
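One way a message like this can arise is a bounded frame queue: when decoding outpaces the downstream processing, the decoder has to pause until there is room. The sketch below only illustrates that idea under assumed names (`produce_frames`, `pending`, the 0.4 s pause); it is not this project's actual code.

```python
import queue
import time


def produce_frames(frames, pending, pause_s=0.4, sleep=time.sleep):
    """Push frames into a bounded queue, pausing whenever it is full.

    Returns how many times the producer had to pause.
    """
    pauses = 0
    for frame in frames:
        while True:
            try:
                pending.put_nowait(frame)
                break
            except queue.Full:
                # Downstream work has not caught up yet: wait, then retry.
                pauses += 1
                sleep(pause_s)
    return pauses
```

In a real pipeline a separate consumer thread would drain `pending`; a frequent pause then simply means the consumer is the bottleneck, not that decoding itself is slow.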

Kiteretsu77 commented 1 year ago

The new branch supports GPU encoding as the default option (you can change it in config.py if you prefer not to use it). Moreover, based on my tests, GPU encoding speeds up the program a little. If you have some spare GPU memory (about 1.5 GB extra is needed for 480P input, before Super-Resolution), I recommend using this default GPU encoding.
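As a rough illustration of what a GPU/CPU encoding switch can look like, the sketch below builds an FFmpeg command line that uses `hevc_nvenc` when GPU encoding is enabled and falls back to the software encoder `libx265` otherwise. The function name, the `use_gpu` flag, and the bitrate default are assumptions for illustration; the project's real option lives in config.py and may be named and wired differently.

```python
def build_encode_cmd(src, dst, use_gpu=True, bitrate="1M"):
    """Return an FFmpeg argument list for re-encoding src into dst."""
    # hevc_nvenc is FFmpeg's NVENC-backed HEVC encoder; libx265 runs on the CPU.
    codec = "hevc_nvenc" if use_gpu else "libx265"
    return ["ffmpeg", "-y", "-i", src, "-c:v", codec, "-b:v", bitrate, dst]
```

The returned list can be passed to `subprocess.run(...)` on a machine where FFmpeg was built with NVENC support.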

Kiteretsu77 commented 1 year ago

The new branch will support GPU encoding as the default option (you can change it in config.py if you prefer not to use it). Moreover, based on my tests, GPU encoding speeds up the program quite a lot. If you have a little unused GPU memory (about 1.5 GB extra is needed for 480P input, before Super-Resolution), I recommend using this default encoding.

For GPU decoding, I don't have a clear idea of how to achieve it with moviepy. Also, my understanding is that with hardware decoding we usually need to first transfer the data from CPU memory to the GPU, and after the GPU decodes it, the frames have to be transferred back to the CPU, which adds unnecessary data transfer cost. If I use GPU decoding, I also need to consider moving all the CPU code to CUDA. However, that would be a giant engineering task. Meanwhile, I need to think about how to redesign the whole structure to fully utilize the benefits of GPU decoding. Therefore, for now, I need more time to consider GPU decoding.
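The round trip described above can be seen in a plain FFmpeg invocation: with `-hwaccel cuda` the GPU decodes the stream, but when the frames are piped out as raw video they land back in CPU memory. The sketch below only builds such a command and computes how many bytes each RGB frame occupies; the function names are assumptions, and this is not how the project currently decodes video.

```python
def build_hw_decode_cmd(src):
    """FFmpeg arguments that decode src on the GPU and emit raw RGB frames on stdout."""
    return [
        "ffmpeg", "-loglevel", "error",
        "-hwaccel", "cuda",      # ask FFmpeg to decode on the GPU via CUDA
        "-i", src,
        "-f", "rawvideo",        # no container, just frame bytes
        "-pix_fmt", "rgb24",     # 3 bytes per pixel
        "pipe:1",                # write to stdout, i.e. back into CPU memory
    ]


def frame_bytes(width, height):
    """Size of one rgb24 frame, i.e. how much to read from the pipe per frame."""
    return width * height * 3
```

For an assumed 640x480 frame, `frame_bytes(640, 480)` is 921,600 bytes, which is the amount that crosses back to host memory for every decoded frame in this setup.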

My12123 commented 1 year ago

Why is CUDA used for encoding? It would be better to use the dedicated NVENC video engine. Why is the CPU loaded at 100%? Will video decoding be added? (Screenshots: 2023-09-12_18-41-05, 2023-09-12_18-42-53)

Kiteretsu77 commented 1 year ago

I have always felt that the GPU utilization reported by Windows Task Manager is inaccurate. What I use for encoding is actually the NVIDIA hardware encoder (-c:v hevc_nvenc) built into FFmpeg, accessed through the moviepy package. I think NVENC is actually being used, but Windows Task Manager cannot show that accurately.
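An independent way to check whether NVENC is busy is `nvidia-smi dmon -s u`, which reports encoder (`enc`) and decoder (`dec`) utilization alongside SM and memory utilization. The sketch below parses one sample of that output; the helper names are illustrative assumptions, and because the exact set of columns varies between driver versions, the column names are read from the header line instead of being hard-coded.

```python
import subprocess


def parse_dmon(text):
    """Parse `nvidia-smi dmon` output into a list of {column: value} rows."""
    header, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The first comment line names the columns; later ones hold units.
            if header is None:
                header = line.lstrip("#").split()
            continue
        if header is not None:
            rows.append(dict(zip(header, line.split())))
    return rows


def sample_utilization():
    """Take one utilization sample (requires an NVIDIA driver with nvidia-smi)."""
    out = subprocess.run(
        ["nvidia-smi", "dmon", "-s", "u", "-c", "1"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_dmon(out)
```

Running `sample_utilization()` while an encode is in progress and looking at the `enc` value gives a second opinion next to what Task Manager displays.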

Kiteretsu77 / FAST_Anime_VSR

Add NVENC Video Encode and Decode GPU #8