iperov / DeepFaceLab

DeepFaceLab is the leading software for creating deepfakes.
GNU General Public License v3.0

Does the RTX 3080 work well with DFL? #906

Open megascomnenus opened 3 years ago

megascomnenus commented 3 years ago


My 3080 doesn't work with DFL.

DFL only runs when I use the CPU.

I have the latest NVIDIA driver installed and the DFL build updated in August, but the GPU still doesn't work.

The card works fine elsewhere, e.g. Counter-Strike, Valorant, LoL, 3DMark, video.

Installing the CUDA driver didn't help either.

test1230-lab commented 3 years ago

can you post screenshot of nvidia-smi
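If a text dump is easier to share than a screenshot, here is a minimal sketch, assuming `nvidia-smi` is on the PATH. The helper names `parse_gpu_rows` and `query_gpus` are made up for this example; only the `--query-gpu` and `--format` options come from `nvidia-smi` itself.

```python
import subprocess

# Fields requested from nvidia-smi; all three are standard --query-gpu properties.
QUERY_FIELDS = ["name", "driver_version", "memory.total"]


def parse_gpu_rows(csv_text):
    """Turn nvidia-smi CSV output (no header) into a list of dicts, one per GPU."""
    rows = []
    for line in csv_text.strip().splitlines():
        values = [v.strip() for v in line.split(",")]
        if len(values) == len(QUERY_FIELDS):
            rows.append(dict(zip(QUERY_FIELDS, values)))
    return rows


def query_gpus():
    """Run nvidia-smi and return the parsed rows.

    Raises FileNotFoundError if nvidia-smi is not installed or not on the PATH.
    """
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_gpu_rows(result.stdout)
```

Calling `query_gpus()` on the affected machine and pasting the result gives the GPU name, driver version and memory in one line per card.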

megascomnenus commented 3 years ago

> can you post screenshot of nvidia-smi

1 Idle state

2 Ethereum mining state

iperov commented 3 years ago

seems like RTX 3k breaks backward compatibility with CUDA 9.2

test1230-lab commented 3 years ago

Could dfl not run on a newer version of cuda?

dream80 commented 3 years ago

You need CUDA 11 + TensorFlow 1.15.2 (the nv version). I successfully ran DFL on an A100; the 3080 is the same architecture.

test1230-lab commented 3 years ago

Ok so the "binary" must be updated to replace the old included cuda version

megascomnenus commented 3 years ago

> You need CUDA 11 + TensorFlow 1.15.2 (the nv version). I successfully ran DFL on an A100; the 3080 is the same architecture.


Thank you for answer.

CUDA 11 + TensorFlow 2.3.0
CUDA 11 + TensorFlow 1.15.2
CUDA 10.1 + TensorFlow 2.3.0

I tried installing each of these combinations, but none of them worked. I don't know which part is the problem.

If other 3080 users test this, we'll see what's wrong.

test1230-lab commented 3 years ago

Did you run it with the .bat files? Also, DFL does not use TF 2.x.
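A quick way to confirm which TensorFlow line a Python environment has is to look at the major version. This is a small sketch rather than anything DFL ships; the function names are made up for this example.

```python
def tf_major_version(version_string):
    """Return the major version number from a string such as '1.15.2'."""
    return int(version_string.split(".")[0])


def check_tf_for_dfl(version_string):
    """Return a short message saying whether this TensorFlow is a 1.x release."""
    if tf_major_version(version_string) == 1:
        return "OK: TensorFlow {} is a 1.x release".format(version_string)
    return "Mismatch: TensorFlow {} is not 1.x".format(version_string)


def report_installed_tf():
    """Check the TensorFlow installed in the current environment, if any."""
    try:
        import tensorflow as tf  # imported lazily so the helpers work without TF
    except ImportError:
        return "TensorFlow is not installed in this Python environment"
    return check_tf_for_dfl(tf.__version__)
```

Running `print(report_installed_tf())` with the same Python interpreter that DFL uses shows whether a 2.x install has crept in.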

iperov commented 3 years ago

try this build https://mega.nz/file/WgVX3QZb#mM-4gY87qWHrLON6SbJfeBUmRmNZlhaHOOJWWt-aV3k

megascomnenus commented 3 years ago

> try this build https://mega.nz/file/WgVX3QZb#mM-4gY87qWHrLON6SbJfeBUmRmNZlhaHOOJWWt-aV3k


The picture above shows the versions currently installed: CUDA 11 and TensorFlow 1.15.2.

I downloaded the build from your link and tried it, but it doesn't work.

I uploaded a video of my attempt to extract faces: https://youtu.be/8HvWs9ZaDpQ

iperov commented 3 years ago

try this https://mega.nz/file/WskhTaqJ#nqPU7cnfV3QZnBt3PgQSZ7F1lV_gCyK6F5i0ArItRuA

dream80 commented 3 years ago

A special TensorFlow 1.15.2 build, compiled by NVIDIA, is required. This is very important. I use Ubuntu 18.04. If you use the DFL Windows build, you need to replace the TF version and the CUDA and cuDNN DLLs.
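Before swapping DLLs it helps to see which ones a build actually bundles. A rough sketch, assuming the usual Windows file names (`cudart64_*.dll`, `cudnn64_*.dll` and so on); `find_cuda_dlls` is a hypothetical helper and the folder you pass in is whatever your DFL build directory is.

```python
from pathlib import Path

# File name patterns for the CUDA and cuDNN libraries TensorFlow loads on Windows.
DLL_PATTERNS = [
    "cudart64_*.dll",
    "cublas64_*.dll",
    "cufft64_*.dll",
    "curand64_*.dll",
    "cusolver64_*.dll",
    "cusparse64_*.dll",
    "cudnn64_*.dll",
]


def find_cuda_dlls(root):
    """Recursively collect CUDA/cuDNN DLL paths below `root`, sorted by file name."""
    root = Path(root)
    found = []
    for pattern in DLL_PATTERNS:
        found.extend(root.rglob(pattern))
    return sorted(found, key=lambda p: p.name)
```

The numeric suffix in each file name hints at the version the build was packaged with, which is the part that would need to change.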

megascomnenus commented 3 years ago

> A special TensorFlow 1.15.2 build, compiled by NVIDIA, is required. This is very important. I use Ubuntu 18.04. If you use the DFL Windows build, you need to replace the TF version and the CUDA and cuDNN DLLs.

As far as I can find, there is no Windows version of 'tensorflow 1.15.2+nv'.

I looked at https://developer.nvidia.com/embedded/downloads#?search=tensorflow but couldn't find a Windows version there.

Instead, I installed CUDA 10, cuDNN 7.6.5, and TensorFlow 1.15.2.

GPU detection in TensorFlow gives the following output.

Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 17:00:18) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-09-25 06:20:26.405668: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
>>> tf.__version__
'1.15.2'
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2020-09-25 06:20:33.028850: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-09-25 06:20:33.036937: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-09-25 06:20:33.067991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties:
name: GeForce RTX 3080 major: 8 minor: 6 memoryClockRate(GHz): 1.74
pciBusID: 0000:26:00.0
2020-09-25 06:20:33.075671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-09-25 06:20:33.083759: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-09-25 06:20:33.092244: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-09-25 06:20:33.098834: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-09-25 06:20:33.107826: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-09-25 06:20:33.115569: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-09-25 06:20:33.127998: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-09-25 06:20:33.134070: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2020-09-25 06:20:34.029890: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-25 06:20:34.035905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186]      0
2020-09-25 06:20:34.040880: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0:   N
2020-09-25 06:20:34.044904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/device:GPU:0 with 7844 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3080, pci bus id: 0000:26:00.0, compute capability: 8.6)
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 13241750559676449420
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 8225635697
locality {
  bus_id: 1
  links {
  }
}
incarnation: 13019861911155521269
physical_device_desc: "device: 0, name: GeForce RTX 3080, pci bus id: 0000:26:00.0, compute capability: 8.6"
]
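Two things in a log like this are worth pulling out: which CUDA libraries TensorFlow opened, and which compute capability the card reports. A minimal sketch that scans pasted log text for both; the function names are made up for this example and it only relies on the message wording seen in the log above.

```python
import re


def loaded_libraries(log_text):
    """List the dynamic libraries TensorFlow reports as successfully opened."""
    return re.findall(r"Successfully opened dynamic library (\S+)", log_text)


def compute_capabilities(log_text):
    """Return the distinct 'compute capability: X.Y' values found in the log."""
    caps = re.findall(r"compute capability: (\d+)\.(\d+)", log_text)
    return sorted({(int(major), int(minor)) for major, minor in caps})
```

On the log above this would report `cudart64_100.dll` among the libraries, which by its file name looks like a CUDA 10.0 runtime, next to a device reporting compute capability 8.6.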

> try this https://mega.nz/file/WskhTaqJ#nqPU7cnfV3QZnBt3PgQSZ7F1lV_gCyK6F5i0ArItRuA

Still doesn't work.

iperov commented 3 years ago

The last build is TF 1.15.2 with CUDA 10 and cuDNN 7.6.0, the same versions used when TF 1.15.2 was compiled.

megascomnenus commented 3 years ago

> The last build is TF 1.15.2 with CUDA 10 and cuDNN 7.6.0, the same versions used when TF 1.15.2 was compiled.

So is it correct to install CUDA 10 (not 10.1), TensorFlow 1.15.2 (not the nv version), and cuDNN 7.6.0?

I changed to cuDNN 7.6.0, but it still doesn't work.

iperov commented 3 years ago

you don't need to install anything when using DFL builds.

megascomnenus commented 3 years ago

> you don't need to install anything when using DFL builds.

If so, should I stop trying and wait for DFL to be updated?

iperov commented 3 years ago

Currently there is no way to run DFL on a 3080, because TensorFlow doesn't support it.

megascomnenus commented 3 years ago

> Currently there is no way to run DFL on a 3080, because TensorFlow doesn't support it.

Then I have to wait until TensorFlow supports the RTX 3080 before DFL can run on it. Thanks for your answer.

test1230-lab commented 3 years ago

how about the version of tf @dream80 mentioned?

megascomnenus commented 3 years ago

> how about the version of tf @dream80 mentioned?

That version, as far as I know, is for Linux only.

I can't use it because I use Windows.

yuanshiyuanyi commented 3 years ago

> You need CUDA 11 + TensorFlow 1.15.2 (the nv version). I successfully ran DFL on an A100; the 3080 is the same architecture.

Can you share your DFL? Thank you

megascomnenus commented 3 years ago

I found that the RTX 3000 series only works properly with CUDA 11.1, which was released on Sep 23.

But, as far as I know, even the latest version of TensorFlow, 2.3.0, supports only CUDA 10.

I hope TensorFlow 2.4.0 is released soon and that DFL supports it.

https://news.developer.nvidia.com/cuda-11-1-introduces-support-rtx-30-series/
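The CUDA requirement follows from the card's compute capability: the RTX 30 series reports 8.6, while the A100 reports 8.0. Below is a small lookup sketch of the first CUDA toolkit release with native support for a few architectures. The table is hand-written from NVIDIA's release notes and covers only these four entries; `cuda_supports` is a hypothetical helper. PTX JIT compilation can sometimes let an older toolkit run on a newer card, so treat this as a rough guide.

```python
# First CUDA toolkit release with native support for each compute capability.
MIN_CUDA_FOR_CC = {
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing, e.g. RTX 20 series
    (8, 0): (11, 0),  # Ampere A100
    (8, 6): (11, 1),  # Ampere RTX 30 series
}


def cuda_supports(compute_capability, cuda_version):
    """Return True if `cuda_version` natively targets `compute_capability`.

    Both arguments are (major, minor) tuples. Raises KeyError for a
    compute capability that is not in the small table above.
    """
    required = MIN_CUDA_FOR_CC[compute_capability]
    return tuple(cuda_version) >= required
```

For example, `cuda_supports((8, 6), (10, 0))` is False and `cuda_supports((8, 6), (11, 1))` is True, which matches the observation that CUDA 10 builds do not drive the 3080 properly.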

megascomnenus commented 3 years ago

I'm not sure, but there are some reports that TensorFlow 1.15 can support CUDA 11.1. https://news.developer.nvidia.com/developer-blog-accelerating-tensorflow-on-nvidia-a100-gpus/

The author of this article succeeded in running TensorFlow 1.15 on an RTX 3000 series card. https://www.pugetsystems.com/labs/hpc/RTX3090-TensorFlow-NAMD-and-HPCG-Performance-on-Linux-Preliminary-1902/

iperov commented 3 years ago

@megascomnenus thx, I will test tf 1.15 with cuda 11.1 right now with rtx 2K

iperov commented 3 years ago

works

Now somebody test with RTX 3k

https://mega.nz/file/Pgs0jKrI#ptg3INE95knjWVC4XDmAp7-VlWO6l2gE2xoQfhtftIM

megascomnenus commented 3 years ago

> works
>
> Now somebody test with RTX 3k
>
> https://mega.nz/file/Pgs0jKrI#ptg3INE95knjWVC4XDmAp7-VlWO6l2gE2xoQfhtftIM

I am testing now. It's slow, but it works.

https://youtu.be/eb-ZANifM_4

iperov commented 3 years ago

test saehd iteration time
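The DFL trainer prints its own iteration time in the console, so this is only needed for timing something outside it. A generic sketch for averaging the wall-clock time of a repeated step; `average_iteration_ms` and the `step` callable are made up for this example and are not part of DFL.

```python
import time


def average_iteration_ms(step, iterations=20, warmup=3):
    """Call `step()` repeatedly and return the mean wall-clock time per call in ms.

    The first `warmup` calls are not timed, since early iterations often
    include one-off costs such as kernel compilation or caching.
    """
    for _ in range(warmup):
        step()
    start = time.perf_counter()
    for _ in range(iterations):
        step()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iterations
```

Skipping the warm-up calls matters on a setup like this one, where the first iterations can be much slower than steady-state training.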

megascomnenus commented 3 years ago

Running trainer.

[new] No saved models found. Enter a name of a new model : new

Model first run.

Choose one or several GPU idxs (separated by comma).

[CPU] : CPU [0] : GeForce RTX 3080

[0] Which GPU indexes to choose? : 0

[0] Autobackup every N hour ( 0..24 ?:help ) : 0
[n] Write preview history ( y/n ?:help ) : n
[0] Target iteration : 0
[y] Flip faces randomly ( y/n ?:help ) : y
[8] Batch_size ( ?:help ) : 8
[128] Resolution ( 64-640 ?:help ) : 128
[f] Face type ( h/mf/f/wf/head ?:help ) : f
[df] AE architecture ( ?:help ) : df
[256] AutoEncoder dimensions ( 32-1024 ?:help ) : 256
[64] Encoder dimensions ( 16-256 ?:help ) : 64
[64] Decoder dimensions ( 16-256 ?:help ) : 64
[22] Decoder mask dimensions ( 16-256 ?:help ) : 22
[n] Eyes priority ( y/n ?:help ) : n
[n] Uniform yaw distribution of samples ( y/n ?:help ) : n
[y] Place models and optimizer on GPU ( y/n ?:help ) : y
[n] Use learning rate dropout ( n/y/cpu ?:help ) : n
[y] Enable random warp of samples ( y/n ?:help ) : y
[0.0] GAN power ( 0.0 .. 10.0 ?:help ) : 0.0
[0.0] 'True face' power. ( 0.0000 .. 1.0 ?:help ) : 0.0
[0.0] Face style power ( 0.0..100.0 ?:help ) : 0.0
[0.0] Background style power ( 0.0..100.0 ?:help ) : 0.0
[none] Color transfer for src faceset ( none/rct/lct/mkl/idt/sot ?:help ) : none
[n] Enable gradient clipping ( y/n ?:help ) : n
[n] Enable pretraining mode ( y/n ?:help ) : n
Initializing models: 0%| | 0/5 [00:00<?, ?it/s]
Error: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node node encoder/down1/downs_0/conv1/weight/Initializer/cai (defined at C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) placed on device Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation:
with tf.device(None): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py:1816>
with tf.device(/GPU:0): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py:233> was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_nameindex=-1 requested_devicename='/device:GPU:0' assigned_devicename='' resource_devicename='/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[]
Assign: CPU
Const: CPU
Fill: CPU
VariableV2: CPU
Identity: CPU

Colocation members, user-requested devices, and framework assigned devices, if any: encoder/down1/downs_0/conv1/weight/Initializer/cai/shape_as_tensor (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai/Const (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai (Fill) encoder/down1/downs_0/conv1/weight (VariableV2) /device:GPU:0 encoder/down1/downs_0/conv1/weight/Assign (Assign) /device:GPU:0 encoder/down1/downs_0/conv1/weight/read (Identity) /device:GPU:0 Assign_1 (Assign) /device:GPU:0

     [[node encoder/down1/downs_0/conv1/weight/Initializer/_cai_ (defined at C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]Additional information about colocations:No node-device colocations were active during op 'encoder/down1/downs_0/conv1/weight/Initializer/_cai_' creation.

Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation: with tf.device(None): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py:1816> with tf.device(/GPU:0): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py:233>

Original stack trace for 'encoder/down1/downs_0/conv1/weight/Initializer/cai': File "threading.py", line 884, in _bootstrap File "threading.py", line 916, in _bootstrap_inner File "threading.py", line 864, in run File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread debug=debug, File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init self.on_initialize() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 236, in on_initialize encoder_out_ch = self.encoder.compute_output_channels ( (nn.floatx, bgr_shape)) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 147, in compute_output_channels shape = self.compute_output_shape(shapes) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 121, in compute_output_shape self.build() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build self._build_sub(v[name],name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 35, in _build_sub layer.build() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build self._build_sub(v[name],name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 20, in _build_sub self._buildsub(sublayer, f"{name}{i}") File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 35, in _build_sub layer.build() File "C:\Users\user\Documents\MEGAsync 
Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build self._build_sub(v[name],name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 33, in _build_sub layer.build_weights() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Conv2D.py", line 76, in build_weights self.weight = tf.get_variable("weight", (self.kernel_size,self.kernel_size,self.in_ch,self.out_ch), dtype=self.dtype, initializer=kernel_initializer, trainable=self.trainable ) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 1500, in get_variable aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 1243, in get_variable aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 567, in get_variable aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 519, in _true_getter aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 933, in _get_single_variable aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 258, in call return cls._variable_v1_call(*args, kwargs) File "C:\Users\user\Documents\MEGAsync 
Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 219, in _variable_v1_call shape=shape) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 197, in previous_getter = lambda kwargs: default_variable_creator(None, kwargs) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 2519, in default_variable_creator shape=shape) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 262, in call return super(VariableMetaclass, cls).call(*args, *kwargs) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 1688, in init shape=shape) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 1818, in _init_from_args initial_value(), name="initial_value", dtype=dtype) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 905, in partition_info=partition_info) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\initializers__init.py", line 13, in call__ return tf.zeros( shape, dtype=dtype, name="cai") File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\array_ops.py", line 2350, in zeros output = fill(shape, constant(zero, dtype=dtype), name=name) File "C:\Users\user\Documents\MEGAsync 
Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\array_ops.py", line 171, in fill result = gen_array_ops.fill(dims, value, name=name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 3602, in fill "Fill", dims=dims, value=value, name=name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func return func(args, kwargs) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op attrs, op_def, compute_device) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal op_def=op_def) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in init self._traceback = tf_stack.extract_stack()

Traceback (most recent call last): File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call return fn(*args) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1348, in _run_fn self._extend_graph() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1388, in _extend_graph tf_session.ExtendSession(self._session) tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node {{colocation_node encoder/down1/downs_0/conv1/weight/Initializer/cai}} was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0]. Colocation Debug Info: Colocation group had the following types and supported devices: Root Member(assigned_device_nameindex=-1 requested_devicename='/device:GPU:0' assigned_devicename='' resource_devicename='/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[] Assign: CPU Const: CPU Fill: CPU VariableV2: CPU Identity: CPU

Colocation members, user-requested devices, and framework assigned devices, if any: encoder/down1/downs_0/conv1/weight/Initializer/cai/shape_as_tensor (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai/Const (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai (Fill) encoder/down1/downs_0/conv1/weight (VariableV2) /device:GPU:0 encoder/down1/downs_0/conv1/weight/Assign (Assign) /device:GPU:0 encoder/down1/downs_0/conv1/weight/read (Identity) /device:GPU:0 Assign_1 (Assign) /device:GPU:0

     [[{{node encoder/down1/downs_0/conv1/weight/Initializer/_cai_}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread debug=debug, File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init self.on_initialize() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 571, in on_initialize model.init_weights() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Saveable.py", line 101, in init_weights nn.init_weights(self.get_weights()) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\ops__init__.py", line 48, in init_weights nn.tf_sess.run (ops) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run run_metadata_ptr) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run feed_dict_tensor, options, run_metadata) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run run_metadata) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node node encoder/down1/downs_0/conv1/weight/Initializer/cai (defined at C:\Users\user\Documents\MEGAsync 
Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) placed on device Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation: with tf.device(None): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py:1816> with tf.device(/GPU:0): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py:233> was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0]. Colocation Debug Info: Colocation group had the following types and supported devices: Root Member(assigned_device_nameindex=-1 requested_devicename='/device:GPU:0' assigned_devicename='' resource_devicename='/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[] Assign: CPU Const: CPU Fill: CPU VariableV2: CPU Identity: CPU

Colocation members, user-requested devices, and framework assigned devices, if any: encoder/down1/downs_0/conv1/weight/Initializer/cai/shape_as_tensor (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai/Const (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai (Fill) encoder/down1/downs_0/conv1/weight (VariableV2) /device:GPU:0 encoder/down1/downs_0/conv1/weight/Assign (Assign) /device:GPU:0 encoder/down1/downs_0/conv1/weight/read (Identity) /device:GPU:0 Assign_1 (Assign) /device:GPU:0

     [[node encoder/down1/downs_0/conv1/weight/Initializer/_cai_ (defined at C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]Additional information about colocations:No node-device colocations were active during op 'encoder/down1/downs_0/conv1/weight/Initializer/_cai_' creation.

Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation: with tf.device(None): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py:1816> with tf.device(/GPU:0): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py:233>

Original stack trace for 'encoder/down1/downs_0/conv1/weight/Initializer/cai': File "threading.py", line 884, in _bootstrap File "threading.py", line 916, in _bootstrap_inner File "threading.py", line 864, in run File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread debug=debug, File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init self.on_initialize() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 236, in on_initialize encoder_out_ch = self.encoder.compute_output_channels ( (nn.floatx, bgr_shape)) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 147, in compute_output_channels shape = self.compute_output_shape(shapes) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 121, in compute_output_shape self.build() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build self._build_sub(v[name],name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 35, in _build_sub layer.build() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build self._build_sub(v[name],name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 20, in _build_sub self._buildsub(sublayer, f"{name}{i}") File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 35, in _build_sub layer.build() File "C:\Users\user\Documents\MEGAsync 
Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build self._build_sub(v[name],name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 33, in _build_sub layer.build_weights() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Conv2D.py", line 76, in build_weights self.weight = tf.get_variable("weight", (self.kernel_size,self.kernel_size,self.in_ch,self.out_ch), dtype=self.dtype, initializer=kernel_initializer, trainable=self.trainable ) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 1500, in get_variable aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 1243, in get_variable aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 567, in get_variable aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 519, in _true_getter aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 933, in _get_single_variable aggregation=aggregation) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 258, in call return cls._variable_v1_call(*args, kwargs) File "C:\Users\user\Documents\MEGAsync 
Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 219, in _variable_v1_call shape=shape) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 197, in previous_getter = lambda kwargs: default_variable_creator(None, kwargs) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 2519, in default_variable_creator shape=shape) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 262, in call return super(VariableMetaclass, cls).call(*args, *kwargs) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 1688, in init shape=shape) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 1818, in _init_from_args initial_value(), name="initial_value", dtype=dtype) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 905, in partition_info=partition_info) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\initializers__init.py", line 13, in call__ return tf.zeros( shape, dtype=dtype, name="cai") File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\array_ops.py", line 2350, in zeros output = fill(shape, constant(zero, dtype=dtype), name=name) File "C:\Users\user\Documents\MEGAsync 
Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\array_ops.py", line 171, in fill result = gen_array_ops.fill(dims, value, name=name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gen_array_ops.py", line 3602, in fill "Fill", dims=dims, value=value, name=name) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func return func(args, kwargs) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op attrs, op_def, compute_device) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal op_def=op_def) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in init self._traceback = tf_stack.extract_stack()

sufa5858 commented 3 years ago

I tried on my RTX 3090

Running trainer.

[new] No saved models found. Enter a name of a new model : new

Model first run.

Choose one or several GPU idxs (separated by comma).

[CPU] : CPU
[0] : GeForce RTX 3090

[0] Which GPU indexes to choose? : 0

Caching GPU kernels...
[0] Autobackup every N hour ( 0..24 ?:help ) : 0
[n] Write preview history ( y/n ?:help ) : n
[0] Target iteration : 0
[y] Flip faces randomly ( y/n ?:help ) : y
[8] Batch_size ( ?:help ) : 8
[128] Resolution ( 64-640 ?:help ) : 128
[f] Face type ( h/mf/f/wf/head ?:help ) : f
[df] AE architecture ( ?:help ) : df
[256] AutoEncoder dimensions ( 32-1024 ?:help ) : 256
[64] Encoder dimensions ( 16-256 ?:help ) : 64
[64] Decoder dimensions ( 16-256 ?:help ) : 64
[22] Decoder mask dimensions ( 16-256 ?:help ) : 22
[n] Eyes priority ( y/n ?:help ) : n
[n] Uniform yaw distribution of samples ( y/n ?:help ) : n
[y] Place models and optimizer on GPU ( y/n ?:help ) : y
[n] Use learning rate dropout ( n/y/cpu ?:help ) : n
[y] Enable random warp of samples ( y/n ?:help ) : y
[0.0] GAN power ( 0.0 .. 10.0 ?:help ) : 0.0
[0.0] 'True face' power. ( 0.0000 .. 1.0 ?:help ) : 0.0
[0.0] Face style power ( 0.0..100.0 ?:help ) : 0.0
[0.0] Background style power ( 0.0..100.0 ?:help ) : 0.0
[none] Color transfer for src faceset ( none/rct/lct/mkl/idt/sot ?:help ) : none
[n] Enable gradient clipping ( y/n ?:help ) : n
[n] Enable pretraining mode ( y/n ?:help ) : y
Initializing models: 100%|###############################################################| 5/5 [00:12<00:00, 2.56s/it]
Loaded 15844 packed faces from C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\pretrain_CelebA
Sort by yaw: 100%|##################################################################| 128/128 [00:00<00:00, 366.66it/s]
Sort by yaw: 100%|##################################################################| 128/128 [00:00<00:00, 374.15it/s]
=============== Model Summary ===============
== ==
== Model name: new_SAEHD ==
== ==
== Current iteration: 0 ==
== ==
==------------- Model Options -------------==
== ==
== resolution: 128 ==
== face_type: f ==
== models_opt_on_gpu: True ==
== archi: df ==
== ae_dims: 256 ==
== e_dims: 64 ==
== d_dims: 64 ==
== d_mask_dims: 22 ==
== masked_training: True ==
== eyes_prio: False ==
== uniform_yaw: True ==
== lr_dropout: n ==
== random_warp: False ==
== gan_power: 0.0 ==
== true_face_power: 0.0 ==
== face_style_power: 0.0 ==
== bg_style_power: 0.0 ==
== ct_mode: none ==
== clipgrad: False ==
== pretrain: True ==
== autobackup_hour: 0 ==
== write_preview_history: False ==
== target_iter: 0 ==
== random_flip: True ==
== batch_size: 8 ==
== ==
==-------------- Running On ---------------==
== ==
== Device index: 0 ==
== Name: GeForce RTX 3090 ==
== VRAM: 24.00GB ==
== ==

Starting. Press "Enter" to stop training and save model.

Trying to do the first iteration. If an error occurs, reduce the model parameters.

2020-09-28 04:16:54.919484: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED Error: Blas GEMM launch failed : a.shape=(8, 256), b.shape=(8, 16384), m=256, n=16384, k=8 [[node gradients/MatMul_3_grad/MatMul_1 (defined at C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]

Original stack trace for 'gradients/MatMul_3_grad/MatMul_1': File "threading.py", line 884, in _bootstrap File "threading.py", line 916, in _bootstrap_inner File "threading.py", line 864, in run File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread debug=debug, File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init self.on_initialize() File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 471, in on_initialize gpu_G_loss_gvs += [ nn.gradients ( gpu_G_loss, self.src_dst_trainable_weights ) ] File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\ops__init.py", line 55, in tf_gradients grads = gradients.gradients(loss, vars, colocate_gradients_with_ops=True ) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_impl.py", line 158, in gradients unconnected_gradients) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 679, in _GradientsHelper lambda: grad_fn(op, out_grads)) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 350, in _MaybeCompile return grad_fn() # Exit early File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 679, in lambda: grad_fn(op, out_grads)) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\math_grad.py", line 1586, in _MatMulGrad grad_b = gen_math_ops.mat_mul(a, grad, transpose_a=True) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6136, in mat_mul name=name) 
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func return func(*args, **kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op attrs, op_def, compute_device) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal op_def=op_def) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in init__ self._traceback = tf_stack.extract_stack()

...which was originally created as op 'MatMul_3', defined at: File "threading.py", line 884, in _bootstrap [elided 3 identical lines from previous traceback] File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init self.on_initialize() File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 335, in on_initialize gpu_src_code = self.inter(self.encoder(gpu_warped_src)) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in call return self.forward(*args, kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 102, in forward x = self.dense2(x) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in call return self.forward(*args, *kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Dense.py", line 66, in forward x = tf.matmul(x, weight) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper return target(args, kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2754, in matmul a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6136, in mat_mul name=name)

Traceback (most recent call last): File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call return fn(*args) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn target_list, run_metadata) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(8, 256), b.shape=(8, 16384), m=256, n=16384, k=8 [[{{node gradients/MatMul_3_grad/MatMul_1}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last): File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 123, in trainerThread iter, iter_time = model.train_one_iter() File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 462, in train_one_iter losses = self.onTrainOneIter() File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 636, in onTrainOneIter src_loss, dst_loss = self.src_dst_train (warped_src, target_src, target_srcm_all, warped_dst, target_dst, target_dstm_all) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 503, in src_dst_train self.target_dstm_all:target_dstm_all, File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run run_metadata_ptr) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run feed_dict_tensor, options, run_metadata) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run run_metadata) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(8, 256), b.shape=(8, 16384), m=256, n=16384, k=8 [[node gradients/MatMul_3_grad/MatMul_1 (defined at C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]

Original stack trace for 'gradients/MatMul_3_grad/MatMul_1': File "threading.py", line 884, in _bootstrap File "threading.py", line 916, in _bootstrap_inner File "threading.py", line 864, in run File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread debug=debug, File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init self.on_initialize() File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 471, in on_initialize gpu_G_loss_gvs += [ nn.gradients ( gpu_G_loss, self.src_dst_trainable_weights ) ] File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\ops__init.py", line 55, in tf_gradients grads = gradients.gradients(loss, vars, colocate_gradients_with_ops=True ) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_impl.py", line 158, in gradients unconnected_gradients) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 679, in _GradientsHelper lambda: grad_fn(op, out_grads)) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 350, in _MaybeCompile return grad_fn() # Exit early File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 679, in lambda: grad_fn(op, out_grads)) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\math_grad.py", line 1586, in _MatMulGrad grad_b = gen_math_ops.mat_mul(a, grad, transpose_a=True) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6136, in mat_mul name=name) 
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper op_def=op_def) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func return func(*args, **kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op attrs, op_def, compute_device) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal op_def=op_def) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in init__ self._traceback = tf_stack.extract_stack()

...which was originally created as op 'MatMul_3', defined at: File "threading.py", line 884, in _bootstrap [elided 3 identical lines from previous traceback] File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init self.on_initialize() File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 335, in on_initialize gpu_src_code = self.inter(self.encoder(gpu_warped_src)) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in call return self.forward(*args, kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 102, in forward x = self.dense2(x) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in call return self.forward(*args, *kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Dense.py", line 66, in forward x = tf.matmul(x, weight) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper return target(args, kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2754, in matmul a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6136, in mat_mul name=name)

devilemperor commented 3 years ago

@iperov With the latest build from 27.09.20 (which was probably built for the 3000 series), "faceset extract" doesn't work on an RTX 2080. If you choose the RTX 2080, it runs on the CPU only and ignores the GPU.

iperov commented 3 years ago

Seems like it does not work. tf 1.15 is compiled with cuda 10, so the cuda 11.1 DLLs are not accepted.
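
One way to see this kind of mismatch is to ask the OS loader for the CUDA runtime by file name. A minimal sketch, assuming (from commonly reported loader messages, not from this thread) that a CUDA 10.0 build of TensorFlow looks for `cudart64_100.dll` while CUDA 11.x ships `cudart64_110.dll`; the helper `can_load` is hypothetical:

```python
import ctypes


def can_load(library_name):
    """Return True if the OS loader can find and load the named library."""
    try:
        ctypes.CDLL(library_name)
        return True
    except OSError:
        # Raised when the file is missing or one of its dependencies fails to load.
        return False


# Assumed file names for the CUDA 10.0 and CUDA 11.x runtimes on Windows.
for name in ("cudart64_100.dll", "cudart64_110.dll"):
    print(name, "loadable" if can_load(name) else "not found")
```

On Python 3.8 and later for Windows, `ctypes` no longer searches `PATH` for bare names by default, so a full path may be needed there. The DFL bundle in the logs above uses Python 3.6.8.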

test1230-lab commented 3 years ago

@dream80 how did you install the special tf version on linux?

megascomnenus commented 3 years ago

The tf 2.4.0 nightly build is updated constantly, and the latest one is dated 10.03.20 (tf-nightly is a beta test version). https://pypi.org/project/tf-nightly-gpu/

It is not clear whether it supports the RTX 3000 series, but it is stated to support cuda 11. I also saw a post reporting success with cuda 11.1 + cudnn-8.0.4 + nv driver 456.43 + python 3.7 or 3.8 + tf 2.4.0 dev. https://blog.csdn.net/tophuihui/article/details/108896615

It also seems that on Linux it can run even without tf 2.4.0. If you are in a hurry, you can use Linux.

iperov commented 3 years ago

tf nightly is compiled with cuda 10.1.

Here are other problems with running rtx3 in windows.

megascomnenus commented 3 years ago

tf nightly is compiled with cuda 10.1.

Here are other problems with running rtx3 in windows.

125

Doesn't this mean it supports cuda 11?

And he got the RTX 3080 and 3090 working with tf 2.4.0. Run the page through a translator and check. 'win10+Tensorflow + cuda +RTX 3090/3080 +cudnn' https://blog.csdn.net/tophuihui/article/details/108896615

Recently, the RTX 3080 / 3090 were released, greatly increasing deep learning computing power. I started testing as soon as possible, and it really is hydrogen-bomb level! According to my translation of the article, the official version (tf 2.3.0) does not support cuda 11. The latest cuda 11, cudnn, graphics card driver and tensorflow versions must all match one another exactly.

  1. Install CUDA: cuda 11.1 + cudnn-v8.0.4
  2. nv driver 456.43 required
  3. Python 3.7 or 3.8 can be installed with Anaconda
  4. Install tensorflow nightly version https://pypi.org/project/tf-nightly-gpu (2.4.0 dev version)
  5. If tensorflow says that cusolver64_10.dll is missing, search the internet for it and copy it to the C:\Windows\System32 directory. You can also take it from an older version's cuda directory: the "_10" refers to version 10.0, so if the file there is named "cusolver64_100.dll", rename it by deleting the trailing 0 and then copy it; the effect is the same. The directory is usually "C:\Program Files\Nvidia GPU Computing Toolkit\CUDA\v1x.x\bin", and you can copy the file from another machine or from an earlier installed version. Other files may be missing besides this one, but the installation steps will tell you which files are missing.
  6. You can test using this project. [Polyp-Segmentation-using-UNET-in-TensorFlow-2.0]
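
The step about the missing cusolver64_10.dll comes down to finding out which directory, if any, already holds the file before copying anything. A minimal sketch using only the standard library; the helpers `find_dll` and `missing_dlls` are hypothetical, and searching `PATH` is an assumption about where the CUDA bin directories are registered:

```python
import os
from pathlib import Path


def find_dll(filename, search_dirs):
    """Return the first existing path to `filename` in `search_dirs`, else None."""
    for directory in search_dirs:
        candidate = Path(directory) / filename
        if candidate.is_file():
            return candidate
    return None


def missing_dlls(filenames, search_dirs):
    """Return the file names that were not found in any of `search_dirs`."""
    return [name for name in filenames if find_dll(name, search_dirs) is None]


# Look through the directories listed in PATH; empty entries are skipped.
path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
print(missing_dlls(["cusolver64_10.dll"], path_dirs))
```

An empty list means the file is already reachable through `PATH`. Otherwise the printed names are the ones still to be located, for example in an older CUDA install's bin directory as the article suggests.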

iperov commented 3 years ago

https://blog.csdn.net/tophuihui/article/details/108896615

firefox_2020-10-04_14-54-12

he uses CPU version.

iperov commented 3 years ago

https://pypi.org/project/tf-nightly-gpu/

these are only for linux, not windows.

iperov commented 3 years ago

I don't trust such Chinese pages. They are like from another planet.

sufa5858 commented 3 years ago

https://blog.csdn.net/tophuihui/article/details/108896615

firefox_2020-10-04_14-54-12

he uses CPU version.

He said he wrote it wrong. It’s GPU.

linhaohoward commented 3 years ago

anyone tried and proven this to work?

mikoim commented 3 years ago

For Linux🐧 users,

Nvidia provides a docker container image including Tensorflow 1.15.x and CUDA 11.1. It seems to work with my old GPU (RTX 2080). Please try this container image if you have 3000 series.

You need to install the latest Docker and NVIDIA Container Toolkit. https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

docker pull nvcr.io/nvidia/tensorflow:20.09-tf1-py3
docker run -it --gpus all -v /home/foobar/data:/data nvcr.io/nvidia/tensorflow:20.09-tf1-py3 /bin/bash

# Tips
# 1. Do not install Tensorflow via pip. It has been already installed. Edit requirements-cuda.txt before installing deps.

linhaohoward commented 3 years ago

For Linux🐧 users,

Nvidia provides a docker container image including Tensorflow 1.15.x and CUDA 11.1. It seems to work with my old GPU (RTX 2080). Please try this container image if you have 3000 series.

You need to install the latest Docker and NVIDIA Container Toolkit. https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker

docker pull nvcr.io/nvidia/tensorflow:20.09-tf1-py3
docker run -it --gpus all -v /home/foobar/data:/data nvcr.io/nvidia/tensorflow:20.09-tf1-py3 /bin/bash

# Tips
# 1. Do not install Tensorflow via pip. It has been already installed. Edit requirements-cuda.txt before installing deps.

anyone tried if this works on DFL linux?
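
The tip in the quoted post, editing requirements-cuda.txt so that pip does not reinstall TensorFlow inside the container, could be sketched as follows. This is a hypothetical helper, not part of DFL. The package names it filters (`tensorflow`, `tensorflow-gpu`) and the sample lines are assumptions, since the thread does not show the file's contents:

```python
import re

# Package names treated as "TensorFlow itself" (an assumption for illustration).
TENSORFLOW_PACKAGES = {"tensorflow", "tensorflow-gpu"}


def strip_tensorflow(requirement_lines):
    """Drop requirement lines that would install TensorFlow; keep the rest."""
    kept = []
    for line in requirement_lines:
        # The package name is everything before the first version specifier,
        # extras bracket, environment marker, or space.
        name = re.split(r"[<>=!~\[; ]", line.strip(), maxsplit=1)[0].lower()
        if name in TENSORFLOW_PACKAGES:
            continue
        kept.append(line)
    return kept


# Made-up sample lines, not the real contents of requirements-cuda.txt.
sample = ["numpy==1.17.0", "tensorflow-gpu==1.15.2", "opencv-python"]
print(strip_tensorflow(sample))  # ['numpy==1.17.0', 'opencv-python']
```

Writing the filtered lines to a new file and passing that file to `pip install -r` would leave the container's preinstalled TensorFlow untouched.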

Jack29913 commented 3 years ago

Still no solution for windows?

iperov commented 3 years ago

compilation will take 1-2 weeks on my second comp unless tf releases 2.4.0 for windows

blanuk commented 3 years ago

@iperov u need CPU power?

dodorugefu commented 3 years ago

Thank you for compiling iperov. I'm one of those who bought it but can't use it (I want to donate very much, but none of the payments from my country work)

iperov commented 3 years ago

@blanuk yes, I need a comp with a powerful CPU, 32gb ram and win 7/10

iperov commented 3 years ago

seems like compiling TF will take about 1-2 months on my notebook

test1230-lab commented 3 years ago

ask on telegram, im sure someone will have a decent cpu
