Open C00reNUT opened 9 months ago
This is the config:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # GPU device for inference

class configuration:
    def __init__(self):
        pass

    ###################################################### Frequently Edited Setting ######################################################
    # Model explanation: the 3 models we currently support are Real-CUGAN + Real-ESRGAN + VCISR
    # Real-CUGAN: The original model weight provided by BiliBili (from https://github.com/bilibili/ailab/tree/main)
    # Real-ESRGAN: The Anime version of RRDB with 6 blocks (the full model has 23 blocks) (from https://github.com/xinntao/Real-ESRGAN/blob/master/docs/model_zoo.md#for-anime-images--illustrations)
    # VCISR: A model I trained with the methods of my upcoming paper on Anime training datasets (more details will be released soon!)
    ########################################################### Fundamental Setting #######################################################
    model_name = "Real-ESRGAN"  # Supported: "Real-CUGAN" (base:2x) || "Real-ESRGAN" (base:4x) || "VCISR" (base:2x)
    inp_path = "input.mp4"      # Input path (can be a single video file or a folder directory with videos)
    opt_path = "output.mp4"     # Output path after processing the video(s) of inp_path (PS: if inp_path is a folder, opt_path should also be a folder)
    rescale_factor = 1          # Rescale applied to the input frames before Super-Resolution [use this to reduce the SR model's computation]
                                # [default 1 means no rescale] We recommend values like 0.5 or 0.25 to avoid invalid input sizes in certain minor cases
    ########################################################################################################################################
    # Auxiliary setting
    decode_fps = 24             # FPS the input source should be decoded at; if -1, use the original FPS. I recommend 24 FPS because Anime is made at 23.98 (~24) FPS; thus, some 30+ FPS anime videos are, from my perspective, falsely interpolated with unnecessary frames.
    use_tensorrt = True         # TensorRT increases speed a lot, so installing it is highly recommended
    use_rename = False          # Sometimes videos that users download include unsupported characters in the filename, so we rename the file if this is True

    # Multithread and Multiprocessing setting
    process_num = 2             # The number of fully parallel processed video clips
    full_model_num = 2          # Full-frame thread instance number
    nt = 2                      # Partition-frame (1/3 of a frame) instance number
    # PS:
    # Reference for my 5600X + 3090 Ti setting for Real-CUGAN (almost full power)
    # **For Real-ESRGAN there are some bugs when nt != 0; I am still analyzing them. To use Real-ESRGAN, we recommend setting nt = 0**
    # Input Resolution: process_num x (full_model_num + nt)
    #   720P: 3 x (2 + 2)
    #   540P: 3 x (3 + 2)
    #   480P: 3 x (3 + 3)
    ########################################################################################################################################
    ########################################### General Details Setting ####################################################################
    pixel_padding = 6           # This value should be divisible by 6 (usually you don't need to change it)

    # Model name to architecture name
    _architecture_dict = {
        "Real-CUGAN": "cunet",
        "Real-ESRGAN": "rrdb",
        "VCISR": "rrdb",
    }
    architecture_name = _architecture_dict[model_name]

    # Default weight provided by the model
    _scale_base_dict = {
        "Real-CUGAN": 2,
        "Real-ESRGAN": 4,
        "VCISR": 2,
    }
    scale = _scale_base_dict[model_name]
    scale_base = _scale_base_dict[model_name]
    ########################################################################################################################################
    ######################################## Redundancy Acceleration Setting ###############################################################
    # This part is used for redundancy acceleration
    MSE_range = 0.2             # How much Mean Squared Error difference between 2 frames you can tolerate (I choose 0.2; the smaller it is, the better the quality)
    Max_Same_Frame = 40         # How many frames/sub-frames at most we can jump (40-70 is OK)
    momentum_skip_crop_frame_num = 4  # Use 3 || 4
    target_saved_portion = 0.2  # Proposed for 30 FPS; with a lower FPS setting, it should be lower. This is a reference value; usually 0.09-0.7 is acceptable for performance
    Queue_hyper_param = 700     # The larger it is, the bigger the allowed queue size and the more cache it will have (higher memory cost, less sleep)
    ########################################################################################################################################
    ######################################### Multi-threading and Video Encoding Setting ###################################################
    # Original setting: p_sleep = (0.005, 0.012), decode_sleep = 0.001
    p_sleep = (0.005, 0.015)    # Sleep time used in multi-threading (empirical value)
    decode_sleep = 0.001        # Sleep time used in video decoding
    # Several recommended options for crf (higher means lower quality) and preset (faster means lower quality but less time):
    #   High Quality: ['-crf', '19', '-preset', 'slow']
    #   Balanced: ['-crf', '23', '-preset', 'medium']
    #   Lower quality but smaller size and faster: ['-crf', '28', '-preset', 'fast']
    # Note 1: If you feel that your GPU has unused power (+ unused GPU memory) and your CPU is almost fully occupied:
    #   USE the DEFAULT ["-c:v", "hevc_nvenc"]; hardware encoding relieves CPU pressure and increases speed
    # Note 2: If you want a smaller data size (lower bitrate and fewer bits/pixel):
    #   You can use HEVC (H.265) as the encoder by appending ["-c:v", "libx265"], but the overall processing speed will be lower due to the increased complexity
    encode_params = ['-crf', '23', '-preset', 'medium', "-tune", "animation", "-c:v", "hevc_nvenc"]
    ########################################################################################################################################
    # Info needed by the TensorRT weight generator
    sample_img_dir = "tensorrt_weight_generator/full_sample.png"
    full_croppped_img_dir = "tensorrt_weight_generator/full_croppped_img.png"
    partition_frame_dir = "tensorrt_weight_generator/partition_cropped_img.png"
    weights_dir = "weights/"
    model_full_name = ""
    model_partition_name = ""
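Since the failure below turns out to hinge on whether the ffmpeg build actually contains hevc_nvenc, a small pre-flight check can catch the problem before encoding starts. This is only a sketch, not part of the repo: the function names (`choose_encoder`, `probe_encoder`) and the libx264 fallback are my own illustration.

```python
import shutil
import subprocess

def choose_encoder(encoders_output, preferred="hevc_nvenc", fallback="libx264"):
    """Pick `preferred` only if it appears in the `ffmpeg -encoders` listing."""
    return preferred if preferred in encoders_output else fallback

def probe_encoder(preferred="hevc_nvenc", fallback="libx264"):
    """Ask the ffmpeg binary on PATH which encoders it was compiled with."""
    ffmpeg = shutil.which("ffmpeg")
    if ffmpeg is None:
        return fallback  # no ffmpeg on PATH; let the software encoder be the default
    out = subprocess.run([ffmpeg, "-hide_banner", "-encoders"],
                         capture_output=True, text=True).stdout
    return choose_encoder(out, preferred, fallback)
```

With such a check, `encode_params` could be built as `[..., "-c:v", probe_encoder()]` so the config degrades to software encoding instead of crashing mid-pipeline.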
process_num and nt (the partition number) are the same as in config.py, but there is an error during processing.
If I set the parameters to
# Multithread and Multiprocessing setting
process_num = 1 # The number of fully parallel processed video clips
full_model_num = 1 # Full frame thread instance number
nt = 1 # Partition frame (1/3 part of a frame) instance number
the processing starts, but the output is never exported and the process never finishes
All Processes Start
Set new attr for inp_path to be tmp/part0.mp4
Set new attr for opt_path to be tmp/part0_res.mp4
This process id is 0
Total FUll Queue size is 700
Total Divided_Block_Queue_size is 700
res_q size is 1400
Full Model Preparation
Real-ESRGAN full : 0
[02/14/2024-13:17:27] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
torch2trt full load+prepare time 0.901 s
Partition Model Preparation
Real-ESRGAN partition : 0
[02/14/2024-13:17:27] [TRT] [W] CUDA lazy loading is not enabled. Enabling it can significantly reduce device memory usage and speed up TensorRT initialization. See "Lazy Loading" section of CUDA documentation https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#lazy-loading
torch2trt full load+prepare time 0.083 s
====================================================================================================
Current Processing file is tmp/part0.mp4
Total frame:147 video decoded frames:0
Total frame:147 video decoded frames:50
Total frame:147 video decoded frames:100
Process 0 had written frames: 0
Process Process-1:
Traceback (most recent call last):
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/site-packages/moviepy/video/io/ffmpeg_writer.py", line 136, in write_frame
self.proc.stdin.write(img_array.tobytes())
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/SR/FAST_Anime_VSR/process/single_video.py", line 258, in single_process
report = video_upscaler(configuration.inp_path, configuration.opt_path)
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/SR/FAST_Anime_VSR/process/inference.py", line 391, in __call__
self.frame_write()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/SR/FAST_Anime_VSR/process/inference.py", line 531, in frame_write
self.writer.write_frame(combined_frame)
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/site-packages/moviepy/video/io/ffmpeg_writer.py", line 180, in write_frame
raise IOError(error)
OSError: [Errno 32] Broken pipe
MoviePy error: FFMPEG encountered the following error while writing file tmp/part0_res.mp4:
b"Unknown encoder 'hevc_nvenc'\n"
The video export failed because FFMPEG didn't find the specified codec for video encoding (libx264). Please install this codec or change the codec when calling write_videofile. For instance:
>>> clip.write_videofile('myvid.webm', codec='libvpx')
Exception in thread Thread-1:
Traceback (most recent call last):
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/Video/SR/FAST_Anime_VSR/process/inference.py", line 92, in run
tmp = self.inp_q.get() # frame_idx (int), position (int, 0|1|2|3), np_frame (numpy)
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/queues.py", line 103, in get
res = self._recv_bytes()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/connection.py", line 212, in recv_bytes
self._check_closed()
File "/mnt/a0b764eb-cdc5-4f46-9a2e-e2f11deba631/PYTHON_CACHE/FAST_Anime_VSR/lib/python3.10/multiprocessing/connection.py", line 136, in _check_closed
raise OSError("handle is closed")
OSError: handle is closed
UpScalerMT #FULL report:
Full Exe Cost: 31.54s on 129 frames and in average 0.2445s
I have compiled and linked a new version of ffmpeg to make sure it supports hevc_nvenc, but the error is still present:
ffmpeg version N-113640-g1e174120d4 Copyright (c) 2000-2024 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.2)
configuration: --enable-nonfree --enable-cuda-nvcc --enable-libnpp --extra-cflags=-I/usr/local/cuda/include --extra-ldflags=-L/usr/local/cuda/lib64 --enable-gpl --enable-gnutls --enable-libaom --enable-libass --enable-libfdk-aac --enable-libfreetype --enable-libmp3lame --enable-libopus --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-nonfree **--enable-nvenc** --enable-libass --disable-debug --enable-libvorbis --enable-libvpx --enable-opencl --enable-gpl --cpu=native --enable-libfdk-aac --enable-libx264 --enable-libx265 --enable-librtmp
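One possible explanation, which the thread does not confirm, is that moviepy does not necessarily invoke the ffmpeg on your PATH: it resolves the binary from the FFMPEG_BINARY environment variable and otherwise falls back to the copy bundled with imageio-ffmpeg, so a newly compiled system ffmpeg with NVENC support may never be the one that runs. A minimal sketch to force the new build (the path below is illustrative):

```python
import os

# Must be set before moviepy is imported; moviepy reads FFMPEG_BINARY at
# import time to decide which ffmpeg executable to invoke for writing video.
# The path is illustrative -- point it at your NVENC-enabled build.
os.environ["FFMPEG_BINARY"] = "/usr/local/bin/ffmpeg"
```

Running `ffmpeg -hide_banner -encoders | grep nvenc` against that exact binary confirms whether hevc_nvenc is actually available to it.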
There was a problem with x; I modified their function, and I also encountered "OSError: handle is closed". The installed ffmpeg version was also correct. Finally, I turned off the CUDA configuration in config.py, specifically line 95, where hevc_nvenc was changed to libx264.
Hello, I am trying to upscale a 1280 × 768 video (H.264 codec) using Real-ESRGAN, and I am getting the following error: