deepfakes / faceswap

Deepfakes Software For All
https://www.faceswap.dev
GNU General Public License v3.0

CLI execution cannot load dependencies, GPU is not used #1259

Open HG4554 opened 2 years ago

HG4554 commented 2 years ago

Describe the bug FaceSwap GUI generates CLI commands that cannot use the GPU because of TensorFlow dependency errors. The Conda env has to be activated for all dependencies to load correctly; it is no longer sufficient to invoke Python directly by its path in the Faceswap env. This did work in the past.

I believe this is the underlying cause of other issues like #1244, hat tip to Replican for posting the solution in the forum.

To Reproduce Steps to reproduce the behavior:

  1. Configure the Extract tool and run it from the Faceswap GUI. It will run normally and utilize the GPU.
  2. Click Generate and run the CLI equivalent in a terminal. It will print dependency errors and run on CPU only.

Expected behavior If GPU works in GUI, it should work in CLI.

This is the current invocation that fails to use GPU:

ec2-user@ip-172-30-1-104 /m/data > /mnt/data/homedir/faceswap8/miniconda3/envs/faceswap/bin/python /mnt/data/homedir/faceswap8/faceswap/faceswap.py extract -i /mnt/data/sourcedata/test/test.mp4 -o /mnt/data/sourcedata/test/output -al /mnt/data/sourcedata/test/test.fsa -D s3fd -A fan -nm none -rf 0 -min 0 -l 0.4 -sz 512 -een 1 -si 0 -ssf -L VERBOSE
...
2022-08-18 17:32:51.796104: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudnn.so.8'; dlerror: libcudnn.so.8: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /mnt/data/homedir/faceswap8/miniconda3/envs/faceswap/lib/python3.9/site-packages/cv2/../../lib64:
2022-08-18 17:32:51.796132: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1850] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
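The warning above says the dynamic loader could not open libcudnn.so.8 using the search path shown in LD_LIBRARY_PATH. As a rough diagnostic (not part of faceswap; the function names are made up for illustration), the same check can be run from whichever Python interpreter is being invoked:

```python
import ctypes
import os


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open `name` in this process."""
    try:
        ctypes.CDLL(name)
    except OSError:
        return False
    return True


def library_search_dirs() -> list:
    """Split LD_LIBRARY_PATH into its non-empty entries."""
    raw = os.environ.get("LD_LIBRARY_PATH", "")
    return [entry for entry in raw.split(os.pathsep) if entry]


if __name__ == "__main__":
    print("LD_LIBRARY_PATH entries:", library_search_dirs())
    print("libcudnn.so.8 loadable:", can_load_shared_library("libcudnn.so.8"))
```

Running this once with the env's Python invoked by path and once after activating the env should show whether the two invocations see different library search paths.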

This invocation activates the Conda env first, then runs the same CLI command, which does utilize the GPU:

ec2-user@ip-172-30-1-104:/mnt/data/homedir/faceswap8$ source miniconda3/etc/profile.d/conda.sh activate
ec2-user@ip-172-30-1-104:/mnt/data/homedir/faceswap8$ conda activate faceswap
(faceswap) ec2-user@ip-172-30-1-104:/mnt/data/homedir/faceswap8$ which python
/mnt/data/homedir/faceswap8/miniconda3/envs/faceswap/bin/python                
(faceswap) ec2-user@ip-172-30-1-104:/mnt/data/homedir/faceswap8$ python /mnt/data/homedir/faceswap8/faceswap/faceswap.py extract -i /mnt/data/sourcedata/test/test.mp4 -o /mnt/data/sourcedata/test/output -al /mnt/data/sourcedata/test/test.fsa -D s3fd -A fan -nm none -rf 0 -min 0 -l 0.4 -sz 512 -een 1 -si 0 -ssf -L VERBOSE
...
2022-08-18 17:51:57.393937: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13650 MB memory:  -> device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5
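For scripted use, where an interactive `conda activate` is awkward, one possible alternative is `conda run`, which executes a command inside a named environment. The sketch below only builds the argument list; the helper name and the shortened arguments are hypothetical, and it has not been verified here that `conda run` picks up the LD_LIBRARY_PATH setting that activation provides in this setup.

```python
import shlex


def build_conda_run_command(env_name, script_args, conda_exe="conda"):
    """Build an argv list that runs `python <script_args>` inside a Conda env."""
    return [conda_exe, "run", "--no-capture-output", "-n", env_name,
            "python", *script_args]


if __name__ == "__main__":
    # Hypothetical arguments, shortened from the command in this issue.
    args = ["faceswap.py", "extract", "-i", "test.mp4", "-o", "output"]
    print(shlex.join(build_conda_run_command("faceswap", args)))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. The `--no-capture-output` flag is there so that log lines stream to the terminal as they are produced.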

torzdf commented 2 years ago

I'm trying to think of the use case for this... it may be that the documentation needs to be updated. I haven't tested the above, but then I have never attempted to run faceswap from outside of the Conda environment.

It probably helps that I wrote the setup script, so I know that it adds LD_LIBRARY_PATH to the activate script: https://github.com/deepfakes/faceswap/blob/ee25a31d33d6e443d519e6459de8adb78616a5bd/setup.py#L345
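To illustrate why invoking the Python binary by path misses this: LD_LIBRARY_PATH has to be in the environment before the process starts, and only activation sets it. A minimal sketch of a wrapper that approximates this by hand follows. It assumes the needed libraries live under `<env>/lib`, which is a guess and not taken from setup.py, and the function name is hypothetical.

```python
import os
from pathlib import Path


def env_with_library_path(env_prefix, base_env=None):
    """Return a copy of the environment with <env_prefix>/lib on LD_LIBRARY_PATH.

    Assumption: the CUDA/cuDNN libraries are under <env_prefix>/lib.
    """
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = str(Path(env_prefix) / "lib")
    entries = [p for p in env.get("LD_LIBRARY_PATH", "").split(os.pathsep) if p]
    if lib_dir not in entries:
        entries.insert(0, lib_dir)  # search the env's libraries first
    env["LD_LIBRARY_PATH"] = os.pathsep.join(entries)
    return env
```

The returned mapping would be passed as `env=` to `subprocess.run` together with the env's Python path, rather than assigned to `os.environ` inside an already running interpreter.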

Ultimately, I'm surprised that just executing the Python binary worked in the past, as the Conda env holds more than just a Python virtual environment (required CUDA/cuDNN binaries and the like).

Generating the CLI arguments from the GUI is just a convenience function. Adapting it to test for Conda environments and the like is most likely a bit out of scope.
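For anyone who wants such a check in their own wrapper script, a minimal sketch is below. It is not faceswap code, and the function names are hypothetical. It relies on two Conda conventions: every environment contains a `conda-meta` directory, and `conda activate` sets the `CONDA_PREFIX` variable to the active environment's path.

```python
import os
import sys
from pathlib import Path


def is_conda_prefix(prefix):
    """Return True if `prefix` looks like a Conda environment directory."""
    return (Path(prefix) / "conda-meta").is_dir()


def conda_env_is_activated(prefix=None, environ=None):
    """Return True if the interpreter's Conda env is also the activated one."""
    prefix = sys.prefix if prefix is None else prefix
    environ = os.environ if environ is None else environ
    active = environ.get("CONDA_PREFIX")
    if not active or not is_conda_prefix(prefix):
        return False
    return Path(active).resolve() == Path(prefix).resolve()


if __name__ == "__main__":
    if is_conda_prefix(sys.prefix) and not conda_env_is_activated():
        print("Warning: running a Conda env's Python without activating the env.")
```

In the failing invocation from this issue, the interpreter lives in a Conda env but `CONDA_PREFIX` would not point at it, so a check like this would print the warning.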

I'm not sure if it's the same as the Docker issue (I do not use Docker), as theoretically Docker has its own globally installed CUDA/cuDNN, which should be used.