tfaehse / DashcamCleaner

Censor identifiable information in videos, in particular dashcam recordings in Germany.
GNU Affero General Public License v3.0

How to use GPU on Desktop Version? #77

Closed nils8107 closed 6 months ago

nils8107 commented 1 year ago

Hello, I use the GUI. When I render a video, it only uses my CPU (i7-13700K); my RTX 4090 sits idle. How can I use the GPU?

Thank you

Dave04O4 commented 1 year ago

I'd like to join in on this.

I always managed to get it working with the older versions, but not anymore.

Maxolus commented 1 year ago

Hey, I had the same problem today. For me, the requirements.txt file installed a PyTorch version without GPU support.

Solution that worked for me (install the correct PyTorch version):

  1. Uninstall torch, torchaudio and torchvision with pip.
  2. Generate the pip command for your setup on pytorch.org (the CUDA version can be found with the nvidia-smi command) and run it in the conda environment.

Cheers

nils8107 commented 1 year ago

Hello Maxolus

Thank you for your solution. As I understand it:

Part 1:

  pip uninstall torch
  pip uninstall torchaudio
  pip uninstall torchvision

Part 2: When I run nvidia-smi, I get this: NVIDIA-SMI 536.67, Driver Version: 536.67, CUDA Version: 12.2

On pytorch.org I only have these options: CUDA 11.7 or CUDA 11.8

I have now installed: pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Now the CPU is at 12% and the GPU is busy too, in spikes, but far from high usage. Is that normal? It is faster now than before with 100% CPU, though.

And could someone please explain what these values do?

  - Feather edges (set to 1)
  - Batch Size (set to 16)
  - Blur Workers (now set to 1)

[Screenshot: pytorch]

Any advice please. Thank you

Maxolus commented 1 year ago

Hey there,

I also have CUDA 12.2 reported by nvidia-smi (RTX 4070 Ti) and use the stable PyTorch 2.0.1 build with CUDA 11.8 support :). So in this case I would choose Stable - Windows - Pip - Python - CUDA 11.8.
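Why a CUDA 11.8 wheel works with "CUDA Version: 12.2": the value printed by nvidia-smi is the highest CUDA version the installed driver supports, and the PyTorch pip wheels ship their own CUDA runtime, so a wheel built for an older CUDA version is fine. A minimal sketch of that rule of thumb follows. The function names are made up for illustration, and the comparison is deliberately conservative.

```python
import re


def driver_cuda_version(nvidia_smi_header: str) -> tuple[int, int]:
    """Extract the 'CUDA Version: X.Y' field from the nvidia-smi header text."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_header)
    if match is None:
        raise ValueError("no 'CUDA Version' field found")
    return int(match.group(1)), int(match.group(2))


def wheel_cuda_version(tag: str) -> tuple[int, int]:
    """Turn a wheel tag such as 'cu118' into (11, 8); the last digit is the minor version."""
    match = re.fullmatch(r"cu(\d+)(\d)", tag)
    if match is None:
        raise ValueError(f"unexpected wheel tag: {tag!r}")
    return int(match.group(1)), int(match.group(2))


def wheel_is_supported(nvidia_smi_header: str, tag: str) -> bool:
    """Conservative check: the wheel's CUDA version must not exceed the driver's."""
    return wheel_cuda_version(tag) <= driver_cuda_version(nvidia_smi_header)


# Header values taken from the nvidia-smi output quoted in this thread.
header = "NVIDIA-SMI 536.67 Driver Version: 536.67 CUDA Version: 12.2"
print(wheel_is_supported(header, "cu118"))  # True
```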

If you haven't uninstalled PyTorch yet, you can also check whether the CPU-only build was installed by mistake by running pip list. If torch 2.0.1+cpu is listed, reinstalling the GPU version should fix it.
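The same check can be done from Python. This is only a sketch: `is_cpu_only_build` is a hypothetical helper that looks at the "+cpu" suffix seen in pip list, and a version string without any suffix tells you nothing either way. The more reliable signal is `torch.cuda.is_available()`, which is only called here if torch is actually installed.

```python
import importlib.util


def is_cpu_only_build(version: str) -> bool:
    """True if the local version label (the part after '+') is 'cpu', e.g. '2.0.1+cpu'."""
    _, _, local = version.partition("+")
    return local == "cpu"


print(is_cpu_only_build("2.0.1+cpu"))    # True
print(is_cpu_only_build("2.0.1+cu118"))  # False

# Only inspect the real installation if torch can be found in this environment.
if importlib.util.find_spec("torch") is not None:
    import torch

    print("installed build:", torch.__version__)
    print("CUDA usable:", torch.cuda.is_available())
```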

Edit: I just realized that it should have worked already. The GPU spikes and the CPU utilization also occur for me; as long as it's faster, it's fine :).

I can only guess what the terms in the UI mean (I'm not a contributor to this project), but basically:

Greetings Maxolus

nils8107 commented 1 year ago

Thank you, Maxolus

I have chosen "Stable - Windows - Pip - Python - CUDA 11.8". The screenshot was a bit older...

Only "torch 2.0.1+cu118" is listed.

I can see that it's running faster than before, and I'm quite happy now.

I have now set Batch Size to 26 and Blur Workers to 4, and it runs even faster. It also uses more GPU memory. I need to play around a bit. I want to use my fast PC at maximum speed...

You helped me a lot

fragy007 commented 9 months ago

Thanks a lot! These steps were helpful! My RTX 3090 is now running fine instead of CPU only :)

tfaehse commented 9 months ago

Happy to see that things worked out!

"Feather edges (set to 1), Batch Size (set to 16), Blur Workers (now set to 1): what do these values do?"

Just for the record, since it's fairly important: the tool runs a four-stage process on the video, in "batches":

  1. Read x frames (batch size)
  2. Pass x frames into the detector (batch size)
  3. Blur x frames (in y == blur workers processes)
  4. Write x frames
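The four steps above can be sketched roughly like this. It is an illustration only, not DashcamCleaner's actual code: `detect`, `blur` and `process` are hypothetical stand-ins, frames are plain strings, and a thread pool stands in for the blur worker processes to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def batches(frames, batch_size):
    """Step 1: yield lists of up to batch_size frames."""
    iterator = iter(frames)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def detect(batch):
    """Step 2 (hypothetical detector): one list of boxes per frame."""
    return [[(0, 0, 1, 1)] for _ in batch]


def blur(frame, boxes):
    """Step 3 (hypothetical blur): tags the frame instead of touching pixels."""
    return f"{frame}:blurred[{len(boxes)}]"


def process(frames, batch_size, blur_workers, write):
    """Run read -> detect -> blur -> write, one batch at a time."""
    with ThreadPoolExecutor(max_workers=blur_workers) as pool:
        for batch in batches(frames, batch_size):
            boxes = detect(batch)
            # map() returns results in input order, so frames stay in sequence.
            for blurred in pool.map(blur, batch, boxes):
                write(blurred)  # Step 4


written = []
process([f"frame{i}" for i in range(5)], batch_size=2, blur_workers=2,
        write=written.append)
print(written)
```

Because each batch is fully blurred and written before the next one is read, a slow blur or write stage holds up the detector, which fits the observation that the GPU is only busy in spikes.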

Effectively, you want the batch size to be as large as possible, before saturating your VRAM. Blur workers should be the same or smaller, depending on how much normal RAM you have available. Interestingly enough, the detector (with GPU support) is often not even the slowest part of this chain, but the blurring/writing is. In the future, I'll try to split these steps up a bit (detect first, then blur, fully sequentially). This will also allow me to run tracking of boxes etc.
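One way to find "as large as possible" in practice is to keep doubling the batch size until a run fails. The sketch below assumes you have some callable that runs one batch of a given size and raises on out-of-memory; `largest_working_batch` and `fake_detector_run` are hypothetical names, and the failure at 40 frames is simulated. As far as I know, PyTorch reports CUDA out-of-memory as a RuntimeError, which is why that type is in the default `errors` tuple.

```python
def largest_working_batch(try_batch, start=1, limit=1024,
                          errors=(MemoryError, RuntimeError)):
    """Double the batch size until try_batch fails; return the last size that worked."""
    best = 0
    size = start
    while size <= limit:
        try:
            try_batch(size)
        except errors:
            break
        best = size
        size *= 2
    return best


def fake_detector_run(batch_size):
    # Stand-in for one detector pass; pretends VRAM runs out above 40 frames.
    if batch_size > 40:
        raise MemoryError("simulated out-of-memory")


print(largest_working_batch(fake_detector_run))  # 32
```

The result is a power of two below the real limit, so there may be some headroom left; values in between (such as the 26 mentioned earlier in this thread) can still be tried by hand.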