xinntao / Real-ESRGAN

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.
BSD 3-Clause "New" or "Revised" License
28.35k stars, 3.56k forks

Does discrete multi-gpu work with video inference? #639

Open cbroker1 opened 1 year ago

cbroker1 commented 1 year ago

My GPUs:

$ nvidia-smi
Tue Jun 6 22:52:48 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 531.68                 Driver Version: 531.68       CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                      TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 4090       WDDM | 00000000:01:00.0  On |                  Off |
| 30%   35C    P0               48W / 450W|   1727MiB / 24564MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 3080       WDDM | 00000000:48:00.0 Off |                  N/A |
|  0%   33C    P8               12W / 370W|      0MiB / 10240MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+

My console call:

python inference_realesrgan_video.py -i bike_4k_.mp4 -o out_bike_4k_realesrganx4plus_outscale4_fe -n RealESRGAN_x4plus --face_enhance --fp32 --outscale 2 --tile 1024 --gpuid 0,1

I've tried several permutations of the flag to select the GPUs (`-g`, `--g`, `-gpu-id`, `--gpu-id`, etc.). Am I missing something?
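
For reference, one script-independent way to choose which GPUs a CUDA program can see is the `CUDA_VISIBLE_DEVICES` environment variable, which the CUDA runtime reads at startup. The sketch below is only an illustration and is not confirmed anywhere in this thread: `build_env` is a hypothetical helper, and whether `inference_realesrgan_video.py` then spreads work across every visible GPU is an assumption to check against the script itself.

```python
import os


def build_env(gpu_ids, base_env=None):
    """Return a copy of the environment that exposes only the given GPU indices.

    build_env is a hypothetical helper for illustration; CUDA_VISIBLE_DEVICES
    itself is a standard CUDA environment variable.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


# Indices follow the nvidia-smi listing above (0 = RTX 4090, 1 = RTX 3080).
# CUDA's own ordering can differ unless CUDA_DEVICE_ORDER=PCI_BUS_ID is set.
env = build_env([0, 1], base_env={})
print(env["CUDA_VISIBLE_DEVICES"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the inference command, which avoids depending on any GPU-selection flag the script may or may not accept.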

cbroker1 commented 1 year ago

Update:

It... works?

python inference_realesrgan_video.py -i first_edit.mp4 -o out_first_edit_realesrganx4plus_outscale6_fe -n RealESRGAN_x4plus --face_enhance --fp32 --outscale 6 --tile 1024

4090 appears to be running fine

image

but 3080 has 0% encoding?

image

What's interesting is that my runtime without the 3080 is ~4 hrs, but now it's ~2 hrs?

image

If the 3080 is actually running, it doesn't have the same compute as a 4090, so my total compute time should not have gone from ~4 hrs to ~2 hrs.

What's going on here? I'm so curious.
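
The arithmetic behind the 4 hrs versus 2 hrs doubt can be sketched as follows. This is a minimal model, assuming the frames are divided among the GPUs up front and the job finishes when the slowest share finishes. The 0.5 relative speed for the 3080 is a made-up placeholder, not a measurement, and `wall_time_hours` is a hypothetical helper.

```python
def wall_time_hours(total_hours_on_ref, speeds, shares):
    """Wall-clock time when work is pre-split across devices.

    total_hours_on_ref: time for the whole job on the reference device (speed 1.0).
    speeds: relative speed of each device (reference = 1.0).
    shares: fraction of the work given to each device (must sum to 1).
    """
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    # Each device needs share * total / speed hours; the slowest one sets the total.
    return max(total_hours_on_ref * share / speed
               for share, speed in zip(shares, speeds))


# ~4 hrs on the 4090 alone (from the post); 0.5 is a placeholder speed for the 3080.
even_split = wall_time_hours(4.0, speeds=[1.0, 0.5], shares=[0.5, 0.5])
balanced = wall_time_hours(4.0, speeds=[1.0, 0.5], shares=[2 / 3, 1 / 3])
print(even_split)  # the slower card's half takes as long as the whole job on the 4090
print(balanced)    # shares proportional to speed finish together
```

Under these assumptions an even split only reaches ~2 hrs if the second card keeps pace with the first, so a displayed ~2 hrs might instead reflect just one process's share of the video. That reading is a guess, not something this thread confirms.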

cbroker1 commented 1 year ago

Update, Update:

Interestingly, on exploring further: when I reduce the tile size from 1024 to 512, the 3080 appears to respond. Note the spikes in the 3080's Video Encoding graph, as well as the similar GPU temperatures.

image
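
To put the tile change in numbers: a tiled upscaler cuts each frame into a grid of roughly `tile` × `tile` patches, so a smaller tile means more, smaller patches and less memory needed per patch. The sketch below assumes a 3840×2160 input frame (suggested only by the `bike_4k_` filename) and a simple ceiling-division grid. `tile_grid` is a hypothetical helper, not Real-ESRGAN's own code.

```python
import math


def tile_grid(width, height, tile):
    """Number of tile columns, rows, and total tiles for one frame."""
    cols = math.ceil(width / tile)
    rows = math.ceil(height / tile)
    return cols, rows, cols * rows


for tile in (1024, 512):
    print(tile, tile_grid(3840, 2160, tile))
# 1024 (4, 3, 12)
# 512 (8, 5, 40)
```

Each 512 patch holds a quarter of the pixels of a 1024 patch, which may matter on the 10 GB 3080. Whether memory was actually the limiting factor here is not established by the thread.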