Open C00reNUT opened 9 months ago
This is the config:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # GPU device for inference

class configuration:
    def __init__(self):
        pass

    ###################################################### Frequently Edited Setting ######################################################
    # Model explanation: the 3 models we currently support are Real-CUGAN + Real-ESRGAN + VCISR
    # Real-CUGAN: The original model weight provided by BiliBili (from https://github.com/bilibili/ailab/tree/main)
    # Real-ESRGAN: The Anime version of RRDB with 6 blocks (the full model has 23 blocks) (from https://github.com/xinntao/Real-ESRGAN/blob/master/docs/model_zoo.md#for-anime-images--illustrations)
    # VCISR: A model I trained with the methods of my upcoming paper on Anime training datasets (more details will be released soon!)
    ########################################################### Fundamental Setting #######################################################
    model_name = "Real-ESRGAN"  # Supported: "Real-CUGAN" (base:2x) || "Real-ESRGAN" (base:4x) || "VCISR" (base:2x)
    inp_path = "input.mp4"      # Input path (can be a single video file or a folder directory with videos)
    opt_path = "output.mp4"     # Output path after processing the video(s) of inp_path (PS: if inp_path is a folder, opt_path should also be a folder)
    rescale_factor = 1          # Rescale applied to the input frames before Super-Resolution [use this to reduce the SR model's computation]
                                # [default 1 means no rescale] We recommend values like 0.5 or 0.25 to avoid invalid input sizes in certain minor cases
    ########################################################################################################################################
    # Auxiliary setting
    decode_fps = 24             # FPS the input source should be decoded at; if -1, use the original FPS. I recommend 24 FPS because Anime is made at 23.98 (~24) FPS; thus, some 30+ FPS anime videos are, from my perspective, falsely interpolated with unnecessary frames.
    use_tensorrt = True         # TensorRT increases speed a lot, so installing it is highly recommended
    use_rename = False          # Sometimes videos that users download include unsupported characters in the filename, so we rename the file if this is True

    # Multithread and Multiprocessing setting
    process_num = 2             # The number of fully parallel processed video clips
    full_model_num = 2          # Full-frame thread instance number
    nt = 2                      # Partition-frame (1/3 of a frame) instance number
    # PS:
    # Reference for my 5600X + 3090 Ti setting for Real-CUGAN (almost full power)
    # **For Real-ESRGAN there are some bugs when nt != 0; I am still analyzing them. To use Real-ESRGAN, we recommend setting nt = 0**
    # Input Resolution: process_num x (full_model_num + nt)
    #   720P: 3 x (2 + 2)
    #   540P: 3 x (3 + 2)
    #   480P: 3 x (3 + 3)
    ########################################################################################################################################
    ########################################### General Details Setting ####################################################################
    pixel_padding = 6           # This value should be divisible by 6 (usually you don't need to change it)

    # Model name to architecture name
    _architecture_dict = {
        "Real-CUGAN": "cunet",
        "Real-ESRGAN": "rrdb",
        "VCISR": "rrdb",
    }
    architecture_name = _architecture_dict[model_name]

    # Default weight provided by the model
    _scale_base_dict = {
        "Real-CUGAN": 2,
        "Real-ESRGAN": 4,
        "VCISR": 2,
    }
    scale = _scale_base_dict[model_name]
    scale_base = _scale_base_dict[model_name]
    ########################################################################################################################################
    ######################################## Redundancy Acceleration Setting ###############################################################
    # This part is used for redundancy acceleration
    MSE_range = 0.2             # How much Mean Squared Error difference between 2 frames you can tolerate (I choose 0.2; the smaller it is, the better the quality)
    Max_Same_Frame = 40         # How many frames/sub-frames at most we can jump (40-70 is OK)
    momentum_skip_crop_frame_num = 4  # Use 3 || 4
    target_saved_portion = 0.2  # Proposed for 30 FPS; with a lower FPS setting, it should be lower. This is a reference value; usually 0.09-0.7 is acceptable for performance
    Queue_hyper_param = 700     # The larger it is, the bigger the allowed queue size and the more cache it will have (higher memory cost, less sleep)
    ########################################################################################################################################
    ######################################### Multi-threading and Video Encoding Setting ###################################################
    # Original setting: p_sleep = (0.005, 0.012), decode_sleep = 0.001
    p_sleep = (0.005, 0.015)    # Sleep time used in multi-threading (empirical value)
    decode_sleep = 0.001        # Sleep time used in video decoding
    # Several recommended options for crf (higher means lower quality) and preset (faster means lower quality but less time):
    #   High Quality: ['-crf', '19', '-preset', 'slow']
    #   Balanced: ['-crf', '23', '-preset', 'medium']
    #   Lower quality but smaller size and faster: ['-crf', '28', '-preset', 'fast']
    # Note 1: If you feel that your GPU has unused power (+ unused GPU memory) and your CPU is almost fully occupied:
    #   USE the DEFAULT ["-c:v", "hevc_nvenc"]; hardware encoding relieves CPU pressure and increases speed
    # Note 2: If you want a smaller data size (lower bitrate and fewer bits/pixel):
    #   You can use HEVC (H.265) as the encoder by appending ["-c:v", "libx265"], but the overall processing speed will be lower due to the increased complexity
    encode_params = ['-crf', '23', '-preset', 'medium', "-tune", "animation", "-c:v", "hevc_nvenc"]
    ########################################################################################################################################
    # Info needed by the TensorRT weight generator
    sample_img_dir = "tensorrt_weight_generator/full_sample.png"
    full_croppped_img_dir = "tensorrt_weight_generator/full_croppped_img.png"
    partition_frame_dir = "tensorrt_weight_generator/partition_cropped_img.png"
    weights_dir = "weights/"
    model_full_name = ""
    model_partition_name = ""
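Since the failure below turns out to hinge on whether the ffmpeg build actually contains hevc_nvenc, a small pre-flight check can catch the problem before encoding starts. This is only a sketch, not part of the repo: the function names (`choose_encoder`, `probe_encoder`) and the libx264 fallback are my own illustration.

```python
import shutil
import subprocess

def choose_encoder(encoders_output, preferred="hevc_nvenc", fallback="libx264"):
    """Pick `preferred` only if it appears in the `ffmpeg -encoders` listing."""
    return preferred if preferred in encoders_output else fallback

def probe_encoder(preferred="hevc_nvenc", fallback="libx264"):
    """Ask the ffmpeg binary on PATH which encoders it was compiled with."""
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        return fallback  # no ffmpeg on PATH; let the software encoder be the default
    out = subprocess.run([ffmpeg, "-hide_banner", "-encoders"],
                         capture_output=True, text=True).stdout
    return choose_encoder(out, preferred, fallback)
```

With such a check, `encode_params` could be built as `[..., "-c:v", probe_encoder()]` so the config degrades to software encoding instead of crashing mid-pipeline.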
process_num and nt (the partition number) are the same as in config.py, but there is an error during processing.
If I set the parameters to
# Multithread and Multiprocessing setting
process_num = 1 # The number of fully parallel processed video clips
full_model_num = 1 # Full frame thread instance number
nt = 1 # Partition frame (1/3 part of a frame) instance number
the processing starts, but the output is never exported and the process never finishes
All Processes Start
Set new attr for inp_path to be tmp/part0.mp4
Set new attr for opt_path to be tmp/part0_res.mp4
This process id is 0
Total FUll Queue size is 700
Total Divided_Block_Queue_size is 700
res_q size is 1400
Full Model Preparation
Real-ESRGAN full : 0
[02/14/2024-13:17:27] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
torch2trt full load+prepare time 0.901 s
Partition Model Preparation
Real-ESRGAN partition : 0
[02/14/2024-13:17:27] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
torch2trt full load+prepare time 0.083 s
====================================================================================================
Current Processing file is tmp/part0.mp4
Total frame:147 video decoded frames:0
Total frame:147 video decoded frames:50
Total frame:147 video decoded frames:100
Process 0 had written frames: 0
Process Process-1:
Traceback (most recent call last):
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/site-packages/moviepy/video/io/ffmpeg_writer.py", line 136, in write_frame
self.proc.stdin.write(img_array.tobytes())
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/SR/FAST_Anime_VSR/process/single_video.py", line 258, in single_process
report = video_upscaler(configuration.inp_path, configuration.opt_path)
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/SR/FAST_Anime_VSR/process/inference.py", line 391, in __call__
self.frame_write()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/SR/FAST_Anime_VSR/process/inference.py", line 531, in frame_write
self.writer.write_frame(combined_frame)
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/site-packages/moviepy/video/io/ffmpeg_writer.py", line 180, in write_frame
raise IOError(error)
OSError: [Errno 32] Broken pipe
MoviePy error: FFMPEG encountered the following error while writing file tmp/part0_res.mp4:
b"Unknown encoder 'hevc_nvenc'\n"
The video export failed because FFMPEG didn't find the specified codec for video encoding (libx264). Please install this codec or change the codec when calling write_videofile. For instance:
>>> clip.write_videofile('myvid.webm', codec='libvpx')
Exception in thread Thread-1:
Traceback (most recent call last):
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/SR/FAST_Anime_VSR/process/inference.py", line 92, in run
tmp = self.inp_q.get() # frame_idx (int), position (int, 0|1|2|3), np_frame (numpy)
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/queues.py", line 103, in get
res = self._recv_bytes()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/connection.py", line 212, in recv_bytes
self._check_closed()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/connection.py", line 136, in _check_closed
raise OSError("handle is closed")
OSError: handle is closed
UpScalerMT #FULL report:
Full Exe Cost: 31.54s on 129 frames and in average 0.2445s
I have compiled and linked a new version of ffmpeg to make sure it supports hevc_nvenc, but the error is still present:
ffmpeg version N-113640-g1e174120d4 Copyright (c) 2000-2024 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.2)
configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-gpl --enable-gnutls --enable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree **--enable-nvenc** --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-opencl --enable-gpl --cpu=native --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-librtmp
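One possible explanation, which the thread does not confirm, is that moviepy does not necessarily invoke the ffmpeg on your PATH: it resolves the binary from the FFMPEG_BINARY environment variable and otherwise falls back to the copy bundled with imageio-ffmpeg, so a newly compiled system ffmpeg with NVENC support may never be the one that runs. A minimal sketch to force the new build (the path below is illustrative):

```python
import os

# Must be set before moviepy is imported; moviepy reads FFMPEG_BINARY at
# import time to decide which ffmpeg executable to invoke for writing video.
# The path is illustrative -- point it at your NVENC-enabled build.
os.environ["FFMPEG_BINARY"] = "/usr/local/bin/ffmpeg"
```

Running `ffmpeg -hide_banner -encoders | grep nvenc` against that exact binary confirms whether hevc_nvenc is actually available to it.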
There was a problem with x; I modified their function, and I also encountered "OSError: handle is closed". The installed ffmpeg version was also correct. Finally, I turned off the CUDA configuration in config.py, specifically line 95, where hevc_nvenc was changed to libx264.
Hello, I am trying to upscale a 1280 × 768 video (H.264 codec) using Real-ESRGAN, and I am getting the following error: