Open megascomnenus opened 3 years ago
Can you post a screenshot of nvidia-smi?
Idle state
Ethereum mining state
Seems like the RTX 30 series breaks backward compatibility with CUDA 9.2.
Couldn't DFL run on a newer version of CUDA?
You need CUDA 11 + the TensorFlow 1.15.2 nv version. I successfully ran DFL on an A100, and the 3080 is the same architecture.
Ok so the "binary" must be updated to replace the old included cuda version
Thank you for the answer.
CUDA 11 + TensorFlow 2.3.0
CUDA 11 + TensorFlow 1.15.2
CUDA 10.1 + TensorFlow 2.3.0
I installed it like this, but it didn't work. I don't know which one is the problem.
If other 3080 users test this, we'll see what's wrong.
Did you run it with the .bat files? Also, DFL does not use TF 2.x.
try this build https://mega.nz/file/WgVX3QZb#mM-4gY87qWHrLON6SbJfeBUmRmNZlhaHOOJWWt-aV3k
The picture above shows the versions currently installed: CUDA 11 and TensorFlow 1.15.2.
I downloaded the build from the link you gave and tried it, but it doesn't work.
I uploaded a video of my attempt to extract a face: https://youtu.be/8HvWs9ZaDpQ
A special TensorFlow 1.15.2 build, compiled by NVIDIA, is required. This is very important. I use Ubuntu 18.04. If you use the DFL Windows build, you need to replace the TF version and the CUDA and cuDNN DLLs.
From what I could find, there seems to be no Windows version of 'tensorflow 1.15.2+nv'.
I found it at this link, but couldn't find a Windows version: https://developer.nvidia.com/embedded/downloads#?search=tensorflow
Instead, I installed CUDA 10, cuDNN 7.6.5, and TensorFlow 1.15.2.
GPU detection in TensorFlow reports the following:
Python 3.6.5 (v3.6.5:f59c0932b4, Mar 28 2018, 17:00:18) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
2020-09-25 06:20:26.405668: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
>>> tf.__version__
'1.15.2'
>>> from tensorflow.python.client import device_lib
>>> device_lib.list_local_devices()
2020-09-25 06:20:33.028850: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2020-09-25 06:20:33.036937: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2020-09-25 06:20:33.067991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1639] Found device 0 with properties: name: GeForce RTX 3080 major: 8 minor: 6 memoryClockRate(GHz): 1.74 pciBusID: 0000:26:00.0
2020-09-25 06:20:33.075671: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2020-09-25 06:20:33.083759: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2020-09-25 06:20:33.092244: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cufft64_100.dll
2020-09-25 06:20:33.098834: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library curand64_100.dll
2020-09-25 06:20:33.107826: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusolver64_100.dll
2020-09-25 06:20:33.115569: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cusparse64_100.dll
2020-09-25 06:20:33.127998: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudnn64_7.dll
2020-09-25 06:20:33.134070: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1767] Adding visible gpu devices: 0
2020-09-25 06:20:34.029890: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1180] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-25 06:20:34.035905: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1186] 0
2020-09-25 06:20:34.040880: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1199] 0: N
2020-09-25 06:20:34.044904: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1325] Created TensorFlow device (/device:GPU:0 with 7844 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3080, pci bus id: 0000:26:00.0, compute capability: 8.6)
[name: "/device:CPU:0" device_type: "CPU" memory_limit: 268435456 locality { } incarnation: 13241750559676449420 , name: "/device:GPU:0" device_type: "GPU" memory_limit: 8225635697 locality { bus_id: 1 links { } } incarnation: 13019861911155521269 physical_device_desc: "device: 0, name: GeForce RTX 3080, pci bus id: 0000:26:00.0, compute capability: 8.6" ]
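For anyone repeating this check, here is a minimal sketch that pulls the compute capability out of the `physical_device_desc` string printed above. The helper name `compute_capability` is hypothetical and the regex only assumes the text format shown in this log. Note that a listed device only means TensorFlow sees the card, not that its CUDA kernels will actually run on it.

```python
import re

def compute_capability(physical_device_desc):
    """Return (major, minor) parsed from a physical_device_desc string, or None."""
    match = re.search(r"compute capability:\s*(\d+)\.(\d+)", physical_device_desc)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))

# Description string copied from the device listing in this thread.
desc = ("device: 0, name: GeForce RTX 3080, "
        "pci bus id: 0000:26:00.0, compute capability: 8.6")
print(compute_capability(desc))
```

With a real TensorFlow 1.x install, the same function could be applied to `d.physical_device_desc` for each entry returned by `device_lib.list_local_devices()`.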
try this https://mega.nz/file/WskhTaqJ#nqPU7cnfV3QZnBt3PgQSZ7F1lV_gCyK6F5i0ArItRuA
Still doesn't work.
The last build uses TF 1.15.2 with CUDA 10 and cuDNN 7.6.0, the same versions that were used to compile TF 1.15.2.
So is it correct to install CUDA 10 (not 10.1), TensorFlow 1.15.2 (not the nv version), and cuDNN 7.6.0?
I changed to cuDNN 7.6.0, but it doesn't work.
you don't need to install anything when using DFL builds.
If so, should I stop trying and wait for DFL to be updated?
Currently there is no way to run DFL on a 3080, because TensorFlow doesn't support it.
Then I'll have to wait until TensorFlow supports the RTX 3080 before I can run DFL on it. Thanks for your answer.
How about the version of TF that @dream80 mentioned?
That version, as far as I know, is for Linux only.
I can't use it because I use Windows.
You need the CUDA 11 + TensorFlow 1.15.2 nv version. I successfully ran DFL on an A100, and the 3080 is the same architecture.
Can you share your DFL? Thank you
I found that the RTX 3000 series only works properly with CUDA 11.1, which was released on Sep 23.
But, as far as I know, even the latest version of TensorFlow, 2.3.0, supports only CUDA 10.1.
I hope TensorFlow 2.4.0 is released soon and that DFL supports it.
https://news.developer.nvidia.com/cuda-11-1-introduces-support-rtx-30-series/
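To make the version constraint above concrete, here is a rough sketch of the check being described. The table is an assumption pieced together from this thread and NVIDIA's release announcements (Turing 7.5 from CUDA 10.0, A100 8.0 from CUDA 11.0, RTX 30 series 8.6 from CUDA 11.1). `MIN_CUDA_FOR_CC` and `natively_supported` are hypothetical names, not part of any library. Older builds can sometimes still run on newer GPUs through PTX JIT compilation, so a `False` here means "no native kernels", not "will certainly fail".

```python
# Assumed minimum CUDA toolkit version with native support for each
# compute capability (illustrative subset, not an official table).
MIN_CUDA_FOR_CC = {
    (7, 5): (10, 0),  # Turing, e.g. RTX 20 series
    (8, 0): (11, 0),  # Ampere A100
    (8, 6): (11, 1),  # Ampere RTX 30 series
}

def natively_supported(cuda_version, compute_capability):
    """Return True if the given CUDA version natively targets the GPU."""
    minimum = MIN_CUDA_FOR_CC.get(compute_capability)
    if minimum is None:
        raise KeyError(f"unknown compute capability: {compute_capability}")
    # Tuples compare element by element, so (11, 1) >= (11, 0) is True.
    return cuda_version >= minimum

# Check the toolkit versions mentioned in this thread against an RTX 3080.
for cuda in [(10, 0), (10, 1), (11, 0), (11, 1)]:
    print(cuda, natively_supported(cuda, (8, 6)))
```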
I'm not sure, but there are reports that TensorFlow 1.15 can support CUDA 11.1. https://news.developer.nvidia.com/developer-blog-accelerating-tensorflow-on-nvidia-a100-gpus/
The author of this article succeeded in running TensorFlow 1.15 on the RTX 3000 series. https://www.pugetsystems.com/labs/hpc/RTX3090-TensorFlow-NAMD-and-HPCG-Performance-on-Linux-Preliminary-1902/
@megascomnenus thx, I will test tf 1.15 with cuda 11.1 right now with rtx 2K
works
Now somebody test with RTX 3k
https://mega.nz/file/Pgs0jKrI#ptg3INE95knjWVC4XDmAp7-VlWO6l2gE2xoQfhtftIM
I am testing now. It's slow, but it works.
test saehd iteration time
Running trainer.
[new] No saved models found. Enter a name of a new model : new
Model first run.
Choose one or several GPU idxs (separated by comma).
[CPU] : CPU [0] : GeForce RTX 3080
[0] Which GPU indexes to choose? : 0
[0] Autobackup every N hour ( 0..24 ?:help ) : 0
[n] Write preview history ( y/n ?:help ) : n
[0] Target iteration : 0
[y] Flip faces randomly ( y/n ?:help ) : y
[8] Batch_size ( ?:help ) : 8
[128] Resolution ( 64-640 ?:help ) : 128
[f] Face type ( h/mf/f/wf/head ?:help ) : f
[df] AE architecture ( ?:help ) : df
[256] AutoEncoder dimensions ( 32-1024 ?:help ) : 256
[64] Encoder dimensions ( 16-256 ?:help ) : 64
[64] Decoder dimensions ( 16-256 ?:help ) : 64
[22] Decoder mask dimensions ( 16-256 ?:help ) : 22
[n] Eyes priority ( y/n ?:help ) : n
[n] Uniform yaw distribution of samples ( y/n ?:help ) : n
[y] Place models and optimizer on GPU ( y/n ?:help ) : y
[n] Use learning rate dropout ( n/y/cpu ?:help ) : n
[y] Enable random warp of samples ( y/n ?:help ) : y
[0.0] GAN power ( 0.0 .. 10.0 ?:help ) : 0.0
[0.0] 'True face' power. ( 0.0000 .. 1.0 ?:help ) : 0.0
[0.0] Face style power ( 0.0..100.0 ?:help ) : 0.0
[0.0] Background style power ( 0.0..100.0 ?:help ) : 0.0
[none] Color transfer for src faceset ( none/rct/lct/mkl/idt/sot ?:help ) : none
[n] Enable gradient clipping ( y/n ?:help ) : n
[n] Enable pretraining mode ( y/n ?:help ) : n
Initializing models: 0%| | 0/5 [00:00<?, ?it/s]
Error: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node node encoder/down1/downs_0/conv1/weight/Initializer/cai (defined at C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) placed on device Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation: with tf.device(None): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py:1816> with tf.device(/GPU:0): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py:233> was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0].
Colocation Debug Info: Colocation group had the following types and supported devices: Root Member(assigned_device_nameindex=-1 requested_devicename='/device:GPU:0' assigned_devicename='' resource_devicename='/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[] Assign: CPU Const: CPU Fill: CPU VariableV2: CPU Identity: CPU
Colocation members, user-requested devices, and framework assigned devices, if any: encoder/down1/downs_0/conv1/weight/Initializer/cai/shape_as_tensor (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai/Const (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai (Fill) encoder/down1/downs_0/conv1/weight (VariableV2) /device:GPU:0 encoder/down1/downs_0/conv1/weight/Assign (Assign) /device:GPU:0 encoder/down1/downs_0/conv1/weight/read (Identity) /device:GPU:0 Assign_1 (Assign) /device:GPU:0
[[node encoder/down1/downs_0/conv1/weight/Initializer/_cai_ (defined at C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]Additional information about colocations:No node-device colocations were active during op 'encoder/down1/downs_0/conv1/weight/Initializer/_cai_' creation.
Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation: with tf.device(None): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py:1816> with tf.device(/GPU:0): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py:233>
Original stack trace for 'encoder/down1/downs_0/conv1/weight/Initializer/cai':
File "threading.py", line 884, in _bootstrap
File "threading.py", line 916, in _bootstrap_inner
File "threading.py", line 864, in run
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread
debug=debug,
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init
self.on_initialize()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 236, in on_initialize
encoder_out_ch = self.encoder.compute_output_channels ( (nn.floatx, bgr_shape))
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 147, in compute_output_channels
shape = self.compute_output_shape(shapes)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 121, in compute_output_shape
self.build()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 35, in _build_sub
layer.build()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 20, in _build_sub
self._buildsub(sublayer, f"{name}{i}")
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 35, in _build_sub
layer.build()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 33, in _build_sub
layer.build_weights()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Conv2D.py", line 76, in build_weights
self.weight = tf.get_variable("weight", (self.kernel_size,self.kernel_size,self.in_ch,self.out_ch), dtype=self.dtype, initializer=kernel_initializer, trainable=self.trainable )
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 1500, in get_variable
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 1243, in get_variable
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 567, in get_variable
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 519, in _true_getter
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 933, in _get_single_variable
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 258, in call
return cls._variable_v1_call(*args, kwargs)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 219, in _variable_v1_call
shape=shape)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 197, in
Traceback (most recent call last): File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call return fn(*args) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1348, in _run_fn self._extend_graph() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1388, in _extend_graph tf_session.ExtendSession(self._session) tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node {{colocation_node encoder/down1/downs_0/conv1/weight/Initializer/cai}} was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0]. Colocation Debug Info: Colocation group had the following types and supported devices: Root Member(assigned_device_nameindex=-1 requested_devicename='/device:GPU:0' assigned_devicename='' resource_devicename='/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[] Assign: CPU Const: CPU Fill: CPU VariableV2: CPU Identity: CPU
Colocation members, user-requested devices, and framework assigned devices, if any: encoder/down1/downs_0/conv1/weight/Initializer/cai/shape_as_tensor (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai/Const (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai (Fill) encoder/down1/downs_0/conv1/weight (VariableV2) /device:GPU:0 encoder/down1/downs_0/conv1/weight/Assign (Assign) /device:GPU:0 encoder/down1/downs_0/conv1/weight/read (Identity) /device:GPU:0 Assign_1 (Assign) /device:GPU:0
[[{{node encoder/down1/downs_0/conv1/weight/Initializer/_cai_}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread debug=debug, File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init self.on_initialize() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 571, in on_initialize model.init_weights() File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Saveable.py", line 101, in init_weights nn.init_weights(self.get_weights()) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\ops__init__.py", line 48, in init_weights nn.tf_sess.run (ops) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run run_metadata_ptr) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run feed_dict_tensor, options, run_metadata) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run run_metadata) File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation encoder/down1/downs_0/conv1/weight/Initializer/cai: Could not satisfy explicit device specification '' because the node node encoder/down1/downs_0/conv1/weight/Initializer/cai (defined at C:\Users\user\Documents\MEGAsync 
Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) placed on device Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation: with tf.device(None): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py:1816> with tf.device(/GPU:0): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py:233> was colocated with a group of nodes that required incompatible device '/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0]. Colocation Debug Info: Colocation group had the following types and supported devices: Root Member(assigned_device_nameindex=-1 requested_devicename='/device:GPU:0' assigned_devicename='' resource_devicename='/device:GPU:0' supported_devicetypes=[CPU] possibledevices=[] Assign: CPU Const: CPU Fill: CPU VariableV2: CPU Identity: CPU
Colocation members, user-requested devices, and framework assigned devices, if any: encoder/down1/downs_0/conv1/weight/Initializer/cai/shape_as_tensor (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai/Const (Const) encoder/down1/downs_0/conv1/weight/Initializer/cai (Fill) encoder/down1/downs_0/conv1/weight (VariableV2) /device:GPU:0 encoder/down1/downs_0/conv1/weight/Assign (Assign) /device:GPU:0 encoder/down1/downs_0/conv1/weight/read (Identity) /device:GPU:0 Assign_1 (Assign) /device:GPU:0
[[node encoder/down1/downs_0/conv1/weight/Initializer/_cai_ (defined at C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA\_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]Additional information about colocations:No node-device colocations were active during op 'encoder/down1/downs_0/conv1/weight/Initializer/_cai_' creation.
Device assignments active during op 'encoder/down1/downs_0/conv1/weight/Initializer/cai' creation: with tf.device(None): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py:1816> with tf.device(/GPU:0): <C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py:233>
Original stack trace for 'encoder/down1/downs_0/conv1/weight/Initializer/cai':
File "threading.py", line 884, in _bootstrap
File "threading.py", line 916, in _bootstrap_inner
File "threading.py", line 864, in run
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread
debug=debug,
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init
self.on_initialize()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 236, in on_initialize
encoder_out_ch = self.encoder.compute_output_channels ( (nn.floatx, bgr_shape))
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 147, in compute_output_channels
shape = self.compute_output_shape(shapes)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 121, in compute_output_shape
self.build()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 35, in _build_sub
layer.build()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 20, in _build_sub
self._buildsub(sublayer, f"{name}{i}")
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 35, in _build_sub
layer.build()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 65, in build
self._build_sub(v[name],name)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 33, in _build_sub
layer.build_weights()
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Conv2D.py", line 76, in build_weights
self.weight = tf.get_variable("weight", (self.kernel_size,self.kernel_size,self.in_ch,self.out_ch), dtype=self.dtype, initializer=kernel_initializer, trainable=self.trainable )
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 1500, in get_variable
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 1243, in get_variable
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 567, in get_variable
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 519, in _true_getter
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variable_scope.py", line 933, in _get_single_variable
aggregation=aggregation)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 258, in call
return cls._variable_v1_call(*args, kwargs)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 219, in _variable_v1_call
shape=shape)
File "C:\Users\user\Documents\MEGAsync Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\variables.py", line 197, in
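The key detail in the trace above is the "All available devices" list: the graph asks for '/device:GPU:0', but TensorFlow only registered a CPU device in this session. A small sketch for pulling that list out of a pasted error message follows. The helper names `available_devices` and `gpu_visible` are hypothetical, and the parsing only assumes the message format shown in this log.

```python
import re

def available_devices(error_text):
    """Extract the device names listed after 'All available devices'."""
    match = re.search(r"All available devices \[([^\]]*)\]", error_text)
    if match is None:
        return []
    return [name.strip() for name in match.group(1).split(",") if name.strip()]

def gpu_visible(error_text):
    """Return True if any listed device is a GPU."""
    return any("/device:GPU:" in name for name in available_devices(error_text))

# Fragment copied from the error message in this thread.
message = ("was colocated with a group of nodes that required incompatible "
           "device '/device:GPU:0'. All available devices "
           "[/job:localhost/replica:0/task:0/device:CPU:0]. ")
print(available_devices(message))
print(gpu_visible(message))
```

If `gpu_visible` comes back `False`, the problem is probably that this TensorFlow build never initialized the GPU at all, rather than anything in the model options.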
I tried on my RTX 3090
Running trainer.
[new] No saved models found. Enter a name of a new model : new
Model first run.
Choose one or several GPU idxs (separated by comma).
[CPU] : CPU [0] : GeForce RTX 3090
[0] Which GPU indexes to choose? : 0
Starting. Press "Enter" to stop training and save model.
Trying to do the first iteration. If an error occurs, reduce the model parameters.
2020-09-28 04:16:54.919484: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED Error: Blas GEMM launch failed : a.shape=(8, 256), b.shape=(8, 16384), m=256, n=16384, k=8 [[node gradients/MatMul_3_grad/MatMul_1 (defined at C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
Original stack trace for 'gradients/MatMul_3_grad/MatMul_1':
File "threading.py", line 884, in _bootstrap
File "threading.py", line 916, in _bootstrap_inner
File "threading.py", line 864, in run
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread
debug=debug,
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init
self.on_initialize()
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 471, in on_initialize
gpu_G_loss_gvs += [ nn.gradients ( gpu_G_loss, self.src_dst_trainable_weights ) ]
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\ops__init.py", line 55, in tf_gradients
grads = gradients.gradients(loss, vars, colocate_gradients_with_ops=True )
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_impl.py", line 158, in gradients
unconnected_gradients)
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 679, in _GradientsHelper
lambda: grad_fn(op, out_grads))
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 350, in _MaybeCompile
return grad_fn() # Exit early
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 679, in
...which was originally created as op 'MatMul_3', defined at: File "threading.py", line 884, in _bootstrap [elided 3 identical lines from previous traceback] File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in init self.on_initialize() File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 335, in on_initialize gpu_src_code = self.inter(self.encoder(gpu_warped_src)) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in call return self.forward(*args, kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 102, in forward x = self.dense2(x) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in call return self.forward(*args, *kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Dense.py", line 66, in forward x = tf.matmul(x, weight) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper return target(args, kwargs) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2754, in matmul a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6136, in mat_mul name=name)
Traceback (most recent call last): File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call return fn(*args) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn target_list, run_metadata) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun run_metadata) tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(8, 256), b.shape=(8, 16384), m=256, n=16384, k=8 [[{{node gradients/MatMul_3_grad/MatMul_1}}]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last): File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 123, in trainerThread iter, iter_time = model.train_one_iter() File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 462, in train_one_iter losses = self.onTrainOneIter() File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 636, in onTrainOneIter src_loss, dst_loss = self.src_dst_train (warped_src, target_src, target_srcm_all, warped_dst, target_dst, target_dstm_all) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 503, in src_dst_train self.target_dstm_all:target_dstm_all, File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run run_metadata_ptr) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run feed_dict_tensor, options, run_metadata) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run run_metadata) File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call raise type(e)(node_def, op, message) tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(8, 256), b.shape=(8, 16384), m=256, n=16384, k=8 [[node gradients/MatMul_3_grad/MatMul_1 (defined at C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
Original stack trace for 'gradients/MatMul_3_grad/MatMul_1':
File "threading.py", line 884, in _bootstrap
File "threading.py", line 916, in _bootstrap_inner
File "threading.py", line 864, in run
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\mainscripts\Trainer.py", line 57, in trainerThread
debug=debug,
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in __init__
self.on_initialize()
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 471, in on_initialize
gpu_G_loss_gvs += [ nn.gradients ( gpu_G_loss, self.src_dst_trainable_weights ) ]
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\ops\__init__.py", line 55, in tf_gradients
grads = gradients.gradients(loss, vars, colocate_gradients_with_ops=True )
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_impl.py", line 158, in gradients
unconnected_gradients)
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 679, in _GradientsHelper
lambda: grad_fn(op, out_grads))
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 350, in _MaybeCompile
return grad_fn() # Exit early
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gradients_util.py", line 679, in
...which was originally created as op 'MatMul_3', defined at:
File "threading.py", line 884, in _bootstrap
[elided 3 identical lines from previous traceback]
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\ModelBase.py", line 189, in __init__
self.on_initialize()
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\models\Model_SAEHD\Model.py", line 335, in on_initialize
gpu_src_code = self.inter(self.encoder(gpu_warped_src))
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\models\ModelBase.py", line 117, in __call__
return self.forward(*args, **kwargs)
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\archis\DeepFakeArchi.py", line 102, in forward
x = self.dense2(x)
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\LayerBase.py", line 14, in __call__
return self.forward(*args, **kwargs)
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\DeepFaceLab\core\leras\layers\Dense.py", line 66, in forward
x = tf.matmul(x, weight)
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2754, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "C:\Users\USER\Downloads\DeepFaceLab_NVIDIA_internal\python-3.6.8\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 6136, in mat_mul
name=name)
@iperov With the latest build from 27.09.20 (which was probably built for the 3000 series), "faceset extract" doesn't work on an RTX 2080. If you choose the RTX 2080, it runs on the CPU only and ignores the GPU.
Seems like it does not work. tf 1.15 is compiled with cuda 10, so cuda 11.1 dlls are not accepted.
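To make the mismatch concrete, here is a rough sketch of why a CUDA 10 build cannot target a 3080 natively. The table values are my reading of NVIDIA's published compute-capability support, the function name `toolkit_supports` is made up for illustration, and the check ignores the PTX JIT fallback that can sometimes let older binaries run on newer cards.

```python
# Minimum CUDA toolkit release that ships native kernels for a given
# compute capability (assumption: taken from NVIDIA's support tables).
MIN_CUDA_FOR_CC = {
    (7, 5): (10, 0),  # Turing, e.g. RTX 2080
    (8, 0): (11, 0),  # Ampere GA100, e.g. A100
    (8, 6): (11, 1),  # Ampere GA10x, e.g. RTX 3080 / 3090
}


def toolkit_supports(cuda_version, compute_capability):
    """Return True if the toolkit can build native code for the GPU.

    Hypothetical helper: both arguments are (major, minor) tuples.
    """
    required = MIN_CUDA_FOR_CC[tuple(compute_capability)]
    return tuple(cuda_version) >= required


print(toolkit_supports((10, 0), (8, 6)))  # False: CUDA 10 predates the 3080
print(toolkit_supports((11, 1), (8, 6)))  # True
```

So a TF 1.15 wheel built against CUDA 10 would have to be rebuilt against CUDA 11.x; dropping newer dlls next to it is not enough.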
@dream80 how did you install the special tf version on linux?
tf 2.4.0 nightly version is constantly being updated and the latest version is 10.03.20 (tf-nightly is a beta test version) https://pypi.org/project/tf-nightly-gpu/
It is not clear if it supports rtx3000, but it is stated that it supports cuda 11. Also, I saw a post saying that cuda 11.1 + cudnn-8.0.4 + nv driver 456.43 + python3.7 or 3.8 + tf 2.4.0 dev succeeded. https://blog.csdn.net/tophuihui/article/details/108896615
Also, it seems that Linux can run it even without tf 2.4.0. If you are in a hurry, you can use Linux.
tf nightly is compiled with cuda 10.1.
Here are other problems with running rtx3 in windows.
Doesn't this mean it supports cuda 11?
And he got the RTX 3080 and 3090 working with tf 2.4.0. Run the page through a translator and check. 'win10+Tensorflow + cuda +RTX 3090/3080 +cudnn' https://blog.csdn.net/tophuihui/article/details/108896615
Translated from the article: "Recently, the RTX3080 / 3090 has been released, greatly increasing the power of deep learning computing. I started testing as soon as possible, and it's actually hydrogen-bomb level!" According to the article, the official version (tf 2.3.0) does not support cuda 11, and the latest cuda 11, cudnn, graphics card driver and tensorflow versions all have to match one-to-one. 1. Install CUDA: cuda 11.1 + cudnn-v8.0.4
These are only for linux, not windows.
I don't trust such Chinese pages. They are like from another planet.
He uses the CPU version.
He said he wrote it wrong. It's GPU.
anyone tried and proven this to work?
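One way to settle the CPU-vs-GPU question on your own machine is to ask TensorFlow which devices it actually sees. This is only a sketch: `visible_gpus` is a name I made up, and it relies on `device_lib.list_local_devices()`, which exists in both TF 1.15 and TF 2.x. An empty list means TF would fall back to the CPU.

```python
def visible_gpus():
    """Return the names of GPU devices TensorFlow can see.

    Returns None if TensorFlow cannot be imported at all
    (not installed, or its CUDA dlls failed to load).
    """
    try:
        from tensorflow.python.client import device_lib
    except ImportError:
        return None
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]


gpus = visible_gpus()
if gpus is None:
    print("TensorFlow could not be imported")
elif gpus:
    print("TensorFlow sees GPU devices:", gpus)
else:
    print("No GPU visible to TensorFlow, it will run on the CPU")
```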
For Linux🐧 users,
Nvidia provides a docker container image including Tensorflow 1.15.x and CUDA 11.1. It seems to work with my old GPU (RTX 2080). Please try this container image if you have 3000 series.
You need to install the latest Docker and NVIDIA Container Toolkit. https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker
docker pull nvcr.io/nvidia/tensorflow:20.09-tf1-py3
docker run -it --gpus all -v /home/foobar/data:/data nvcr.io/nvidia/tensorflow:20.09-tf1-py3 /bin/bash
# Tips
# 1. Do not install Tensorflow via pip. It has been already installed. Edit requirements-cuda.txt before installing deps.
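For the tip about editing requirements-cuda.txt, something like the following could strip the TensorFlow entries before running pip inside the container. It is a sketch under assumptions: `drop_tensorflow` is a hypothetical helper, the sample lines are invented and are not the real contents of the file, and it drops every package whose name starts with `tensorflow`.

```python
import re

# Matches requirement lines such as "tensorflow", "tensorflow-gpu==1.13.2"
# or "tensorflow>=1.15"; other packages are left alone.
_TF_LINE = re.compile(
    r"^\s*tensorflow([-_][\w-]+)?\s*([=<>!~\[;].*)?$", re.IGNORECASE
)


def drop_tensorflow(lines):
    """Return the requirement lines without any tensorflow* package."""
    return [line for line in lines if not _TF_LINE.match(line)]


# Invented sample, NOT the actual contents of requirements-cuda.txt.
sample = ["numpy==1.19.3", "tensorflow-gpu==1.13.2", "opencv-python"]
print(drop_tensorflow(sample))  # ['numpy==1.19.3', 'opencv-python']
```

Write the filtered lines back to the file (or to a copy) and install from that, so pip does not replace the TensorFlow build that ships with the image.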
anyone tried if this works on DFL linux?
Still no solution for windows?
compilation will take 1-2 weeks on my second comp unless tf releases 2.4.0 for windows
@iperov u need CPU power?
Thank you for compiling iperov. I'm one of those who bought it but can't use it (I want to donate very much, but none of the payments from my country work)
@blanuk yes I need powerful CPU comp , 32gb ram and win 7/10
seems like compiling TF will take about 1-2 months on my notebook
ask on telegram, im sure someone will have a decent cpu
On Fri, Oct 9, 2020 at 11:09 AM iperov notifications@github.com wrote:
seems like I will compile TF about 1-2 months on my notebook
My 3080 doesn't work with DFL.
It only works when using the CPU.
I have the latest nvidia driver installed and DFL updated in August, but it still doesn't work.
The card works fine elsewhere, e.g. Counter-Strike, Valorant, LoL, 3DMark, video.
Even installing the CUDA driver didn't help.