deepfakes / faceswap

Deepfakes Software For All
https://www.faceswap.dev
GNU General Public License v3.0

Error at training start because of multi GPU bad detection #948

Closed: amallecourt closed this issue 4 years ago

amallecourt commented 4 years ago

Bug

When I start the train script, I encounter the following error: "To call multi_gpu_model with gpus=2, we expect the following devices to be available: ['/cpu:0', '/gpu:0', '/gpu:1']. However this machine only has: ['/cpu:0']. Try reducing gpus."

I pass the argument to use 2 GPUs (-g 2) because I really do have two GPUs in my machine, as you can verify in the bug report: gpu_devices: GPU_0: GeForce RTX 2080 Ti, GPU_1: GeForce RTX 2080 Ti gpu_devices_active: GPU_0, GPU_1

So the GPUs are detected by the system, but not by the script. Do you have any idea what the problem might be?

(I should mention that I don't use Conda. Also, cuDNN doesn't seem to be detected, even though I installed it according to Nvidia's guide. I'm not sure that would have an impact on this particular error, though, would it?)
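For anyone hitting the same symptom: the check that matters here is what TensorFlow itself can see, not what nvidia-smi reports. A hedged diagnostic sketch (not part of the original report; it assumes a TensorFlow 1.x install and falls back gracefully if TensorFlow is absent):

```python
# Diagnostic sketch: list the devices TensorFlow itself can see.
# multi_gpu_model only consults TensorFlow's own device list, so if that
# list is CPU-only, training with -g 2 fails exactly as in the error above,
# no matter how many GPUs the driver reports.
def visible_devices():
    try:
        from tensorflow.python.client import device_lib
        return [d.name for d in device_lib.list_local_devices()]
    except ImportError:
        return []  # TensorFlow is not installed in this environment

if __name__ == "__main__":
    # On a healthy 2-GPU TF 1.x install this should include
    # '/device:GPU:0' and '/device:GPU:1' alongside the CPU device.
    print(visible_devices())
```

If only a CPU device shows up here, the problem is the TensorFlow/CUDA install rather than faceswap's arguments.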

Crash Report

12/06/2019 18:53:00 MainProcess training_0 nn_blocks init DEBUG Initializing NNBlocks: (use_subpixel: False, use_icnr_init: False, use_convaware_init: False, use_reflect_padding: False, first_run: True) 12/06/2019 18:53:00 MainProcess training_0 nn_blocks init DEBUG Initialized NNBlocks 12/06/2019 18:53:00 MainProcess training_0 _base name DEBUG model name: 'original' 12/06/2019 18:53:00 MainProcess training_0 _base rename_legacy DEBUG Renaming legacy files 12/06/2019 18:53:00 MainProcess training_0 _base name DEBUG model name: 'original' 12/06/2019 18:53:00 MainProcess training_0 _base rename_legacy DEBUG No legacy files to rename 12/06/2019 18:53:00 MainProcess training_0 _base load_state_info DEBUG Loading Input Shape from State file 12/06/2019 18:53:00 MainProcess training_0 _base load_state_info DEBUG No input shapes saved. Using model config 12/06/2019 18:53:00 MainProcess training_0 _base multiple_models_in_folder DEBUG model_files: [], retval: False 12/06/2019 18:53:00 MainProcess training_0 original add_networks DEBUG Adding networks 12/06/2019 18:53:01 MainProcess training_0 nn_blocks upscale DEBUG inp: Tensor("input_1:0", shape=(?, 8, 8, 512), dtype=float32), filters: 256, kernel_size: 3, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:01 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: upscale_8_0 12/06/2019 18:53:01 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e87c30f0> 12/06/2019 18:53:01 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("input_1:0", shape=(?, 8, 8, 512), dtype=float32), filters: 1024, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'upscale_8_0_conv2d', 'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f37e87c30f0>}) 12/06/2019 18:53:01 MainProcess training_0 nn_blocks set_default_initializer DEBUG Using model specified initializer: 
<keras.initializers.VarianceScaling object at 0x7f37e87c30f0> 12/06/2019 18:53:01 MainProcess training_0 nn_blocks upscale DEBUG inp: Tensor("upscale_8_0_pixelshuffler/Reshape_1:0", shape=(?, 16, 16, 256), dtype=float32), filters: 128, kernel_size: 3, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:01 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: upscale_16_0 12/06/2019 18:53:01 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e8764ba8> 12/06/2019 18:53:01 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("upscale_8_0_pixelshuffler/Reshape_1:0", shape=(?, 16, 16, 256), dtype=float32), filters: 512, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'upscale_16_0_conv2d', 'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f37e8764ba8>}) 12/06/2019 18:53:01 MainProcess training_0 nn_blocks set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f37e8764ba8> 12/06/2019 18:53:01 MainProcess training_0 nn_blocks upscale DEBUG inp: Tensor("upscale_16_0_pixelshuffler/Reshape_1:0", shape=(?, 32, 32, 128), dtype=float32), filters: 64, kernel_size: 3, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:01 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: upscale_32_0 12/06/2019 18:53:01 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e8770f60> 12/06/2019 18:53:01 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("upscale_16_0_pixelshuffler/Reshape_1:0", shape=(?, 32, 32, 128), dtype=float32), filters: 256, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'upscale_32_0_conv2d', 'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f37e8770f60>}) 12/06/2019 18:53:01 MainProcess 
training_0 nn_blocks set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f37e8770f60> 12/06/2019 18:53:01 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("upscale_32_0_pixelshuffler/Reshape_1:0", shape=(?, 64, 64, 64), dtype=float32), filters: 3, kernel_size: 5, strides: (1, 1), padding: same, kwargs: {'activation': 'sigmoid', 'name': 'face_out'}) 12/06/2019 18:53:01 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e8748b38> 12/06/2019 18:53:01 MainProcess training_0 _base add_network DEBUG network_type: 'decoder', side: 'a', network: '<keras.engine.training.Model object at 0x7f37e86ec710>', is_output: True 12/06/2019 18:53:01 MainProcess training_0 _base name DEBUG model name: 'original' 12/06/2019 18:53:01 MainProcess training_0 _base add_network DEBUG name: 'decoder_a', filename: 'original_decoder_A.h5' 12/06/2019 18:53:01 MainProcess training_0 _base init DEBUG Initializing NNMeta: (filename: '/home/tf1/Documents/Deepfakes/faceswap/model/oss_adrien/original_decoder_A.h5', network_type: 'decoder', side: 'a', network: <keras.engine.training.Model object at 0x7f37e86ec710>, is_output: True 12/06/2019 18:53:06 MainProcess training_0 _base init DEBUG Initialized NNMeta 12/06/2019 18:53:06 MainProcess training_0 nn_blocks upscale DEBUG inp: Tensor("input_2:0", shape=(?, 8, 8, 512), dtype=float32), filters: 256, kernel_size: 3, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: upscale_8_1 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f387db8a438> 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("input_2:0", shape=(?, 8, 8, 512), dtype=float32), filters: 1024, kernel_size: 
3, strides: (1, 1), padding: same, kwargs: {'name': 'upscale_8_1_conv2d', 'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f387db8a438>}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f387db8a438> 12/06/2019 18:53:06 MainProcess training_0 nn_blocks upscale DEBUG inp: Tensor("upscale_8_1_pixelshuffler/Reshape_1:0", shape=(?, 16, 16, 256), dtype=float32), filters: 128, kernel_size: 3, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: upscale_16_1 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e86a54e0> 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("upscale_8_1_pixelshuffler/Reshape_1:0", shape=(?, 16, 16, 256), dtype=float32), filters: 512, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'upscale_16_1_conv2d', 'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f37e86a54e0>}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f37e86a54e0> 12/06/2019 18:53:06 MainProcess training_0 nn_blocks upscale DEBUG inp: Tensor("upscale_16_1_pixelshuffler/Reshape_1:0", shape=(?, 32, 32, 128), dtype=float32), filters: 64, kernel_size: 3, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: upscale_32_1 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e86ced68> 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv2d DEBUG inp: 
Tensor("upscale_16_1_pixelshuffler/Reshape_1:0", shape=(?, 32, 32, 128), dtype=float32), filters: 256, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'upscale_32_1_conv2d', 'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f37e86ced68>}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f37e86ced68> 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("upscale_32_1_pixelshuffler/Reshape_1:0", shape=(?, 64, 64, 64), dtype=float32), filters: 3, kernel_size: 5, strides: (1, 1), padding: same, kwargs: {'activation': 'sigmoid', 'name': 'face_out'}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e8025e10> 12/06/2019 18:53:06 MainProcess training_0 _base add_network DEBUG network_type: 'decoder', side: 'b', network: '<keras.engine.training.Model object at 0x7f37e803b438>', is_output: True 12/06/2019 18:53:06 MainProcess training_0 _base name DEBUG model name: 'original' 12/06/2019 18:53:06 MainProcess training_0 _base add_network DEBUG name: 'decoder_b', filename: 'original_decoder_B.h5' 12/06/2019 18:53:06 MainProcess training_0 _base init DEBUG Initializing NNMeta: (filename: '/home/tf1/Documents/Deepfakes/faceswap/model/oss_adrien/original_decoder_B.h5', network_type: 'decoder', side: 'b', network: <keras.engine.training.Model object at 0x7f37e803b438>, is_output: True 12/06/2019 18:53:06 MainProcess training_0 _base init DEBUG Initialized NNMeta 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv DEBUG inp: Tensor("input_3:0", shape=(?, 64, 64, 3), dtype=float32), filters: 128, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: conv_64_0 12/06/2019 18:53:06 
MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("input_3:0", shape=(?, 64, 64, 3), dtype=float32), filters: 128, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_64_0_conv2d'}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e07e6da0> 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv DEBUG inp: Tensor("conv_64_0_leakyrelu/LeakyRelu:0", shape=(?, 32, 32, 128), dtype=float32), filters: 256, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: conv_32_0 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("conv_64_0_leakyrelu/LeakyRelu:0", shape=(?, 32, 32, 128), dtype=float32), filters: 256, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_32_0_conv2d'}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e077fa20> 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv DEBUG inp: Tensor("conv_32_0_leakyrelu/LeakyRelu:0", shape=(?, 16, 16, 256), dtype=float32), filters: 512, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: conv_16_0 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("conv_32_0_leakyrelu/LeakyRelu:0", shape=(?, 16, 16, 256), dtype=float32), filters: 512, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_16_0_conv2d'}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e078d438> 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv DEBUG inp: 
Tensor("conv_16_0_leakyrelu/LeakyRelu:0", shape=(?, 8, 8, 512), dtype=float32), filters: 1024, kernel_size: 5, strides: 2, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: conv_8_0 12/06/2019 18:53:06 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("conv_16_0_leakyrelu/LeakyRelu:0", shape=(?, 8, 8, 512), dtype=float32), filters: 1024, kernel_size: 5, strides: 2, padding: same, kwargs: {'name': 'conv_8_0_conv2d'}) 12/06/2019 18:53:06 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e07afc18> 12/06/2019 18:53:07 MainProcess training_0 nn_blocks upscale DEBUG inp: Tensor("reshape_1/Reshape:0", shape=(?, 4, 4, 1024), dtype=float32), filters: 512, kernel_size: 3, use_instance_norm: False, kwargs: {}) 12/06/2019 18:53:07 MainProcess training_0 nn_blocks get_name DEBUG Generating block name: upscale_4_0 12/06/2019 18:53:07 MainProcess training_0 nn_blocks set_default_initializer DEBUG Set default kernel_initializer to: <keras.initializers.VarianceScaling object at 0x7f37e06fec50> 12/06/2019 18:53:07 MainProcess training_0 nn_blocks conv2d DEBUG inp: Tensor("reshape_1/Reshape:0", shape=(?, 4, 4, 1024), dtype=float32), filters: 2048, kernel_size: 3, strides: (1, 1), padding: same, kwargs: {'name': 'upscale_4_0_conv2d', 'kernel_initializer': <keras.initializers.VarianceScaling object at 0x7f37e06fec50>}) 12/06/2019 18:53:07 MainProcess training_0 nn_blocks set_default_initializer DEBUG Using model specified initializer: <keras.initializers.VarianceScaling object at 0x7f37e06fec50> 12/06/2019 18:53:07 MainProcess training_0 _base add_network DEBUG network_type: 'encoder', side: 'None', network: '<keras.engine.training.Model object at 0x7f37e0736be0>', is_output: False 12/06/2019 18:53:07 MainProcess training_0 _base name DEBUG model name: 'original' 12/06/2019 18:53:07 MainProcess 
training_0 _base add_network DEBUG name: 'encoder', filename: 'original_encoder.h5' 12/06/2019 18:53:07 MainProcess training_0 _base init DEBUG Initializing NNMeta: (filename: '/home/tf1/Documents/Deepfakes/faceswap/model/oss_adrien/original_encoder.h5', network_type: 'encoder', side: 'None', network: <keras.engine.training.Model object at 0x7f37e0736be0>, is_output: False 12/06/2019 18:53:07 MainProcess training_0 _base init DEBUG Initialized NNMeta 12/06/2019 18:53:07 MainProcess training_0 original add_networks DEBUG Added networks 12/06/2019 18:53:07 MainProcess training_0 _base load_models DEBUG Load model: (swapped: False) 12/06/2019 18:53:07 MainProcess training_0 _base models_exist DEBUG Pre-existing models exist: False 12/06/2019 18:53:07 MainProcess training_0 _base name DEBUG model name: 'original' 12/06/2019 18:53:07 MainProcess training_0 _base load_models INFO Creating new 'original' model in folder: '/home/tf1/Documents/Deepfakes/faceswap/model/oss_adrien' 12/06/2019 18:53:07 MainProcess training_0 _base get_inputs DEBUG Getting inputs 12/06/2019 18:53:07 MainProcess training_0 _base get_inputs DEBUG Got inputs: [<tf.Tensor 'face_in:0' shape=(?, 64, 64, 3) dtype=float32>] 12/06/2019 18:53:07 MainProcess training_0 original build_autoencoders DEBUG Initializing model 12/06/2019 18:53:07 MainProcess training_0 original build_autoencoders DEBUG Adding Autoencoder. Side: a 12/06/2019 18:53:07 MainProcess training_0 _base add_predictor DEBUG Adding predictor: (side: 'a', model: <keras.engine.training.Model object at 0x7f37e06f40b8>) 12/06/2019 18:53:07 MainProcess training_0 _base add_predictor DEBUG Converting to multi-gpu: side a 12/06/2019 18:53:07 MainProcess training_0 multithreading run DEBUG Error in thread (training_0): To call multi_gpu_model with gpus=2, we expect the following devices to be available: ['/cpu:0', '/gpu:0', '/gpu:1']. However this machine only has: ['/cpu:0']. Try reducing gpus. 
12/06/2019 18:53:08 MainProcess MainThread train monitor DEBUG Thread error detected
12/06/2019 18:53:08 MainProcess MainThread train monitor DEBUG Closed Monitor
12/06/2019 18:53:08 MainProcess MainThread train end_thread DEBUG Ending Training thread
12/06/2019 18:53:08 MainProcess MainThread train end_thread CRITICAL Error caught! Exiting...
12/06/2019 18:53:08 MainProcess MainThread multithreading join DEBUG Joining Threads: 'training'
12/06/2019 18:53:08 MainProcess MainThread multithreading join DEBUG Joining Thread: 'training_0'
12/06/2019 18:53:08 MainProcess MainThread multithreading join ERROR Caught exception in thread: 'training_0'
12/06/2019 18:53:08 MainProcess MainThread cli execute_script ERROR To call multi_gpu_model with gpus=2, we expect the following devices to be available: ['/cpu:0', '/gpu:0', '/gpu:1']. However this machine only has: ['/cpu:0']. Try reducing gpus.
Traceback (most recent call last):
  File "/home/tf1/Documents/Deepfakes/faceswap/plugins/train/model/_base.py", line 244, in build
    self.build_autoencoders(inputs)
  File "/home/tf1/Documents/Deepfakes/faceswap/plugins/train/model/original.py", line 44, in build_autoencoders
    self.add_predictor(side, autoencoder)
  File "/home/tf1/Documents/Deepfakes/faceswap/plugins/train/model/_base.py", line 323, in add_predictor
    model = multi_gpu_model(model, self.gpus)
  File "/home/tf1/.local/lib/python3.6/site-packages/keras/utils/multi_gpu_utils.py", line 184, in multi_gpu_model
    available_devices))
ValueError: To call multi_gpu_model with gpus=2, we expect the following devices to be available: ['/cpu:0', '/gpu:0', '/gpu:1']. However this machine only has: ['/cpu:0']. Try reducing gpus.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/tf1/Documents/Deepfakes/faceswap/lib/cli.py", line 128, in execute_script
    process.process()
  File "/home/tf1/Documents/Deepfakes/faceswap/scripts/train.py", line 109, in process
    self.end_thread(thread, err)
  File "/home/tf1/Documents/Deepfakes/faceswap/scripts/train.py", line 135, in end_thread
    thread.join()
  File "/home/tf1/Documents/Deepfakes/faceswap/lib/multithreading.py", line 121, in join
    raise thread.err[1].with_traceback(thread.err[2])
  File "/home/tf1/Documents/Deepfakes/faceswap/lib/multithreading.py", line 37, in run
    self._target(*self._args, **self._kwargs)
  File "/home/tf1/Documents/Deepfakes/faceswap/scripts/train.py", line 160, in training
    raise err
  File "/home/tf1/Documents/Deepfakes/faceswap/scripts/train.py", line 148, in training
    model = self.load_model()
  File "/home/tf1/Documents/Deepfakes/faceswap/scripts/train.py", line 183, in load_model
    predict=False)
  File "/home/tf1/Documents/Deepfakes/faceswap/plugins/train/model/original.py", line 25, in __init__
    super().__init__(*args, **kwargs)
  File "/home/tf1/Documents/Deepfakes/faceswap/plugins/train/model/_base.py", line 115, in __init__
    self.build()
  File "/home/tf1/Documents/Deepfakes/faceswap/plugins/train/model/_base.py", line 253, in build
    raise FaceswapError(str(err)) from err
lib.utils.FaceswapError: To call multi_gpu_model with gpus=2, we expect the following devices to be available: ['/cpu:0', '/gpu:0', '/gpu:1']. However this machine only has: ['/cpu:0']. Try reducing gpus.

============ System Information ============ encoding: UTF-8 git_branch: Sur la branche master git_commits: 16aca4a bugfix - gui - Don't raise error when clearing slider entry widget. 9db67ce bugfix - ImagesSaver - Auto prepend destination folder. 4630977 plugins.extract - Create ExtractMedia class for pipeline flow Bugfix - Fix memory leak in extract. 18660da lib.gui - Resize icons and add attribution. 4591a02 bugfix: tools.preview - Fix for new icons bugfix: lib.vgg_face2_keras - Remove debug code documentation: lib.vgg_face2_keras gpu_cuda: 9.1 gpu_cudnn: No global version found gpu_devices: GPU_0: GeForce RTX 2080 Ti, GPU_1: GeForce RTX 2080 Ti gpu_devices_active: GPU_0, GPU_1 gpu_driver: 430.50 gpu_vram: GPU_0: 11019MB, GPU_1: 11019MB os_machine: x86_64 os_platform: Linux-5.0.0-37-generic-x86_64-with-Ubuntu-18.04-bionic os_release: 5.0.0-37-generic py_command: faceswap.py train -A extract/oss/ -B extract/adrien/ -g 2 -m model/oss_adrien/ -p py_conda_version: N/A py_implementation: CPython py_version: 3.6.9 py_virtual_env: False sys_cores: 16 sys_processor: x86_64 sys_ram: Total: 7810MB, Available: 4761MB, Used: 2436MB, Free: 2618MB

=============== Pip Packages =============== absl-py==0.8.1 apturl==0.5.2 asn1crypto==0.24.0 astor==0.8.0 beautifulsoup4==4.6.0 Brlapi==0.6.6 certifi==2019.9.11 chardet==3.0.4 Click==7.0 command-not-found==0.3 cryptography==2.1.4 cupshelpers==1.0 cycler==0.10.0 Cython==0.26.1 decorator==4.4.1 defer==1.0.6 distro-info===0.18ubuntu0.18.04.1 dlib==19.18.0 dominate==2.4.0 face-alignment==1.0.0 face-recognition==1.2.3 face-recognition-models==0.3.0 fastcluster==1.1.25 ffmpy==0.2.2 gast==0.3.2 gobject==0.1.0 google-pasta==0.1.8 grpcio==1.25.0 h5py==2.10.0 html5lib==0.999999999 httplib2==0.9.2 idna==2.8 imageio==2.6.1 imageio-ffmpeg==0.3.0 ipython==5.5.0 ipython-genutils==0.2.0 joblib==0.14.0 jsonpatch==1.24 jsonpointer==2.0 Keras==2.3.1 Keras-Applications==1.0.8 keras-contrib==2.0.8 Keras-Preprocessing==1.1.0 keyring==10.6.0 keyrings.alt==3.0 kiwisolver==1.1.0 language-selector==0.1 launchpadlib==1.10.6 lazr.restfulclient==0.13.5 lazr.uri==1.0.3 leveldb==0.1 linux-thermaltake-rgb==0.2.0.post1564303991 louis==3.5.0 lxml==4.2.1 macaroonbakery==1.1.3 Mako==1.0.7 Markdown==3.1.1 MarkupSafe==1.0 matplotlib==3.1.2 netifaces==0.10.4 networkx==2.4 nose==1.3.7 numexpr==2.6.4 numpy==1.17.4 nvidia-ml-py3==7.352.0 oauth==1.0.1 olefile==0.45.1 opencv-python==4.1.2.30 pandas==0.22.0 pathlib==1.0.1 pexpect==4.2.1 pickleshare==0.7.4 Pillow==6.2.1 prompt-toolkit==1.0.15 protobuf==3.11.1 psutil==5.6.7 pycairo==1.16.2 pycrypto==2.6.1 pycups==1.9.73 Pygments==2.2.0 pygobject==3.26.1 pymacaroons==0.13.0 PyNaCl==1.1.2 pyparsing==2.4.5 pyRFC3339==1.0 python-apt==1.6.4 python-dateutil==2.8.1 python-debian==0.1.32 python-gflags==1.5.1 pytz==2018.3 pyusb==1.0.2 PyWavelets==1.1.1 pyxdg==0.25 PyYAML==5.2 pyzmq==18.1.0 reportlab==3.4.0 requests==2.22.0 requests-unixsocket==0.1.5 scandir==1.10.0 scikit-image==0.16.2 scikit-learn==0.22 scipy==1.3.3 screen-resolution-extra==0.0.0 SecretStorage==2.3.1 simplegeneric==0.8.1 simplejson==3.13.2 six==1.13.0 system-service==0.3 systemd-python==234 
tables==3.4.2 tensorboard==1.14.0 tensorflow==1.14.0 tensorflow-estimator==1.14.0 tensorflow-gpu==1.14.0 termcolor==1.1.0 tk==0.1.0 toposort==1.5 torch==1.3.1 torchfile==0.1.0 torchvision==0.4.0 tornado==6.0.3 tqdm==4.40.0 traitlets==4.3.2 ubuntu-drivers-common==0.0.0 ufw==0.36 unattended-upgrades==0.1 urllib3==1.25.6 usb-creator==0.3.3 visdom==0.1.8.9 wadllib==1.3.2 wcwidth==0.1.7 webencodings==0.5 websocket-client==0.56.0 Werkzeug==0.16.0 wrapt==1.11.2 xkit==0.0.0 zope.interface==4.3.2

================= Configs ================== --------- .faceswap --------- backend: nvidia

--------- train.ini ---------

[global] coverage: 68.75 mask_type: none mask_blur: False icnr_init: False conv_aware_init: False subpixel_upscaling: False reflect_padding: False penalized_mask_loss: True loss_function: mae learning_rate: 5e-05

[model.realface] input_size: 64 output_size: 128 dense_nodes: 1536 complexity_encoder: 128 complexity_decoder: 512

[model.villain] lowmem: False

[model.dlight] features: best details: good output_size: 256

[model.original] lowmem: False

[model.unbalanced] input_size: 128 lowmem: False clipnorm: True nodes: 1024 complexity_encoder: 128 complexity_decoder_a: 384 complexity_decoder_b: 512

[model.dfl_sae] input_size: 128 clipnorm: True architecture: df autoencoder_dims: 0 encoder_dims: 42 decoder_dims: 21 multiscale_decoder: False

[model.dfl_h128] lowmem: False

[trainer.original] preview_images: 14 zoom_amount: 5 rotation_range: 10 shift_range: 5 flip_chance: 50 color_lightness: 30 color_ab: 8 color_clahe_chance: 50 color_clahe_max_size: 4

--------- extract.ini ---------

[global] allow_growth: False

[detect.cv2_dnn] confidence: 50

[detect.mtcnn] minsize: 20 threshold_1: 0.6 threshold_2: 0.7 threshold_3: 0.7 scalefactor: 0.709 batch-size: 8

[detect.s3fd] confidence: 70 batch-size: 4

[align.fan] batch-size: 12

[mask.vgg_obstructed] batch-size: 2

[mask.vgg_clear] batch-size: 6

[mask.unet_dfl] batch-size: 8

kvrooman commented 4 years ago

TensorFlow 1.14 is not compatible with Keras's multi_gpu_model code, due to a bug introduced in the 1.14 release.

This has been noted for a while (https://github.com/keras-team/keras/issues/13057 and https://github.com/tensorflow/tensorflow/issues/30728). There has been some discussion about whether Keras 2.3.0 fixes it or not.

I suggest downgrading to TF 1.13, as I'm confident that version works with everything.
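As a quick sanity check, here is a small stdlib-only sketch (the helper name is mine, not from this thread) that flags the known-bad 1.14.x line from a version string such as the tensorflow-gpu==1.14.0 pin visible in the pip list above:

```python
# Hedged helper: return True when a TensorFlow version string falls in the
# 1.14.x line reported to break keras.utils.multi_gpu_model.
def affected_by_tf114_bug(version):
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) == (1, 14)

print(affected_by_tf114_bug("1.14.0"))  # True: the version in the crash report
print(affected_by_tf114_bug("1.13.1"))  # False: the suggested downgrade
```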

amallecourt commented 4 years ago

You were right, it was a compatibility problem between libraries. But in my case, downgrading TensorFlow was not sufficient: the problem came from the CUDA version, which was not compatible with tensorflow-gpu. Indeed, no matter the TensorFlow version, it doesn't seem to work with CUDA 9.1.

You have to install either CUDA 9.0 or 10.0, and then be careful with the TensorFlow version, as noted in requirements.txt: 1.12.0<=tensorflow-gpu<=1.13.0 for CUDA 9.0, and 1.13.1<=tensorflow-gpu<1.15 for CUDA 10.0.
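Those constraints can be written down as a small lookup table. This is only a sketch encoding the ranges quoted above (the function name and data shape are mine); any CUDA release not listed, such as 9.1, simply returns None:

```python
# Map a CUDA release to the compatible tensorflow-gpu range quoted from
# faceswap's requirements.txt. Each value is
# (min_version_inclusive, max_version, max_is_inclusive).
TF_RANGES = {
    "9.0": ("1.12.0", "1.13.0", True),   # 1.12.0 <= tensorflow-gpu <= 1.13.0
    "10.0": ("1.13.1", "1.15", False),   # 1.13.1 <= tensorflow-gpu <  1.15
}

def tf_range_for_cuda(cuda_version):
    """Return the supported tensorflow-gpu range, or None if unsupported."""
    return TF_RANGES.get(cuda_version)

print(tf_range_for_cuda("9.1"))   # None: no supported TF build, per the report
print(tf_range_for_cuda("10.0"))  # ('1.13.1', '1.15', False)
```

The point of the table is that CUDA 9.1 has no entry at all, which matches the behaviour observed here: no TensorFlow version worked until CUDA itself was changed.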

In my case, I upgraded CUDA to 10.0, because Nvidia's official archives don't distribute version 9.0 for my OS: Ubuntu 18.04 x86_64.

So it's now working with the following configuration: