chychkan / DeepFaceLab_MacOS

Run DeepFaceLab on MacOS
GNU General Public License v3.0

Can't train on GPU. Model Quick96 error #132

Open prashantspandey opened 1 year ago

prashantspandey commented 1 year ago

Choose one or several GPU idxs (separated by comma).

[CPU] : CPU [0] : METAL

[0] Which GPU indexes to choose? : 0

Metal device set to: Apple M2 Pro

systemMemory: 16.00 GB maxCacheSize: 5.33 GB

18 devices <core.leras.device.Devices object at 0x120f93dc0> GPU COUNT 1 gpu id 0 devices /CPU:0 Initializing models: 0%| | 0/5 [00:00<?, ?it/s]

Error: Graph execution error:

Detected at node 'Mean_2' defined at (most recent call last):
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 930, in _bootstrap
      self._bootstrap_inner()
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
      self.run()
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
      self._target(*self._args, **self._kwargs)
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
      model = models.import_model(model_class_name)(
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
      self.on_initialize()
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 143, in on_initialize
      gpu_src_loss =  tf.reduce_mean ( 10*nn.dssim(gpu_target_src_masked_opt, gpu_pred_src_src_masked_opt, max_val=1.0, filter_size=int(resolution/11.6)), axis=[1])
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 308, in dssim
      ssim_val = tf.reduce_mean(luminance * cs, axis=nn.conv2d_spatial_axes )
Node: 'Mean_2'
Cannot assign a device for operation Mean_2: Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/device:GPU:0' assigned_device_name_='' resource_device_name_='' supported_device_types_=[CPU] possible_devices_=[]
FloorDiv: GPU CPU 
RealDiv: GPU CPU 
Maximum: GPU CPU 
Cast: GPU CPU 
FloorMod: GPU CPU 
BroadcastTo: GPU CPU 
Shape: GPU CPU 
Range: CPU 
DynamicStitch: CPU 
Reshape: GPU CPU 
Mean: GPU CPU 
Prod: GPU CPU 
AddV2: GPU CPU 
Const: GPU CPU 
Fill: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  Mean_2 (Mean) /device:GPU:0
  gradients/Mean_2_grad/Shape (Shape) /device:GPU:0
  gradients/Mean_2_grad/Size (Const) /device:GPU:0
  gradients/Mean_2_grad/add (AddV2) /device:GPU:0
  gradients/Mean_2_grad/mod (FloorMod) /device:GPU:0
  gradients/Mean_2_grad/Shape_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/range/start (Const) /device:GPU:0
  gradients/Mean_2_grad/range/delta (Const) /device:GPU:0
  gradients/Mean_2_grad/range (Range) /device:GPU:0
  gradients/Mean_2_grad/ones/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/ones (Fill) /device:GPU:0
  gradients/Mean_2_grad/DynamicStitch (DynamicStitch) /device:GPU:0
  gradients/Mean_2_grad/Reshape (Reshape) /device:GPU:0
  gradients/Mean_2_grad/BroadcastTo (BroadcastTo) /device:GPU:0
  gradients/Mean_2_grad/Shape_2 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Shape_3 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod (Prod) /device:GPU:0
  gradients/Mean_2_grad/Const_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod_1 (Prod) /device:GPU:0
  gradients/Mean_2_grad/Maximum/y (Const) /device:GPU:0
  gradients/Mean_2_grad/Maximum (Maximum) /device:GPU:0
  gradients/Mean_2_grad/floordiv (FloorDiv) /device:GPU:0
  gradients/Mean_2_grad/Cast (Cast) /device:GPU:0
  gradients/Mean_2_grad/truediv (RealDiv) /device:GPU:0

Op: Mean
Node attrs: Tidx=DT_INT32, keep_dims=false, T=DT_FLOAT
Registered kernels:
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT64]

     [[{{node Mean_2}}]]

Original stack trace for 'Mean_2':
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 930, in _bootstrap
    self._bootstrap_inner()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
    model = models.import_model(model_class_name)(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 143, in on_initialize
    gpu_src_loss =  tf.reduce_mean ( 10*nn.dssim(gpu_target_src_masked_opt, gpu_pred_src_src_masked_opt, max_val=1.0, filter_size=int(resolution/11.6)), axis=[1])
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 308, in dssim
    ssim_val = tf.reduce_mean(luminance * cs, axis=nn.conv2d_spatial_axes )
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1082, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/math_ops.py", line 2581, in reduce_mean_v1
    return reduce_mean(input_tensor, axis, keepdims, name)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1082, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/math_ops.py", line 2639, in reduce_mean
    gen_math_ops.mean(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6286, in mean
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/op_def_library.py", line 740, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 3776, in _create_op_internal
    ret = Operation(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 2175, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

Traceback (most recent call last):
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1377, in _do_call
    return fn(*args)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1359, in _run_fn
    self._extend_graph()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1400, in _extend_graph
    tf_session.ExtendSession(self._session)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation Mean_2: Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/device:GPU:0' assigned_device_name_='' resource_device_name_='' supported_device_types_=[CPU] possible_devices_=[]
FloorDiv: GPU CPU 
RealDiv: GPU CPU 
Maximum: GPU CPU 
Cast: GPU CPU 
FloorMod: GPU CPU 
BroadcastTo: GPU CPU 
Shape: GPU CPU 
Range: CPU 
DynamicStitch: CPU 
Reshape: GPU CPU 
Mean: GPU CPU 
Prod: GPU CPU 
AddV2: GPU CPU 
Const: GPU CPU 
Fill: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  Mean_2 (Mean) /device:GPU:0
  gradients/Mean_2_grad/Shape (Shape) /device:GPU:0
  gradients/Mean_2_grad/Size (Const) /device:GPU:0
  gradients/Mean_2_grad/add (AddV2) /device:GPU:0
  gradients/Mean_2_grad/mod (FloorMod) /device:GPU:0
  gradients/Mean_2_grad/Shape_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/range/start (Const) /device:GPU:0
  gradients/Mean_2_grad/range/delta (Const) /device:GPU:0
  gradients/Mean_2_grad/range (Range) /device:GPU:0
  gradients/Mean_2_grad/ones/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/ones (Fill) /device:GPU:0
  gradients/Mean_2_grad/DynamicStitch (DynamicStitch) /device:GPU:0
  gradients/Mean_2_grad/Reshape (Reshape) /device:GPU:0
  gradients/Mean_2_grad/BroadcastTo (BroadcastTo) /device:GPU:0
  gradients/Mean_2_grad/Shape_2 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Shape_3 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod (Prod) /device:GPU:0
  gradients/Mean_2_grad/Const_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod_1 (Prod) /device:GPU:0
  gradients/Mean_2_grad/Maximum/y (Const) /device:GPU:0
  gradients/Mean_2_grad/Maximum (Maximum) /device:GPU:0
  gradients/Mean_2_grad/floordiv (FloorDiv) /device:GPU:0
  gradients/Mean_2_grad/Cast (Cast) /device:GPU:0
  gradients/Mean_2_grad/truediv (RealDiv) /device:GPU:0

Op: Mean
Node attrs: Tidx=DT_INT32, keep_dims=false, T=DT_FLOAT
Registered kernels:
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT64]

     [[{{node Mean_2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
    model = models.import_model(model_class_name)(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 227, in on_initialize
    model.init_weights()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/layers/Saveable.py", line 106, in init_weights
    nn.init_weights(self.get_weights())
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 48, in init_weights
    nn.tf_sess.run (ops)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 967, in run
    result = self._run(None, fetches, feed_dict, options_ptr,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1190, in _run
    results = self._do_run(handle, final_targets, final_fetches,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1370, in _do_run
    return self._do_call(_run_fn, feeds, fetches, targets, options,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/client/session.py", line 1396, in _do_call
    raise type(e)(node_def, op, message)  # pylint: disable=no-value-for-parameter
tensorflow.python.framework.errors_impl.InvalidArgumentError: Graph execution error:

Detected at node 'Mean_2' defined at (most recent call last):
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 930, in _bootstrap
      self._bootstrap_inner()
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
      self.run()
    File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
      self._target(*self._args, **self._kwargs)
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
      model = models.import_model(model_class_name)(
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
      self.on_initialize()
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 143, in on_initialize
      gpu_src_loss =  tf.reduce_mean ( 10*nn.dssim(gpu_target_src_masked_opt, gpu_pred_src_src_masked_opt, max_val=1.0, filter_size=int(resolution/11.6)), axis=[1])
    File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 308, in dssim
      ssim_val = tf.reduce_mean(luminance * cs, axis=nn.conv2d_spatial_axes )
Node: 'Mean_2'
Cannot assign a device for operation Mean_2: Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
Colocation Debug Info:
Colocation group had the following types and supported devices: 
Root Member(assigned_device_name_index_=-1 requested_device_name_='/device:GPU:0' assigned_device_name_='' resource_device_name_='' supported_device_types_=[CPU] possible_devices_=[]
FloorDiv: GPU CPU 
RealDiv: GPU CPU 
Maximum: GPU CPU 
Cast: GPU CPU 
FloorMod: GPU CPU 
BroadcastTo: GPU CPU 
Shape: GPU CPU 
Range: CPU 
DynamicStitch: CPU 
Reshape: GPU CPU 
Mean: GPU CPU 
Prod: GPU CPU 
AddV2: GPU CPU 
Const: GPU CPU 
Fill: GPU CPU 

Colocation members, user-requested devices, and framework assigned devices, if any:
  Mean_2 (Mean) /device:GPU:0
  gradients/Mean_2_grad/Shape (Shape) /device:GPU:0
  gradients/Mean_2_grad/Size (Const) /device:GPU:0
  gradients/Mean_2_grad/add (AddV2) /device:GPU:0
  gradients/Mean_2_grad/mod (FloorMod) /device:GPU:0
  gradients/Mean_2_grad/Shape_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/range/start (Const) /device:GPU:0
  gradients/Mean_2_grad/range/delta (Const) /device:GPU:0
  gradients/Mean_2_grad/range (Range) /device:GPU:0
  gradients/Mean_2_grad/ones/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/ones (Fill) /device:GPU:0
  gradients/Mean_2_grad/DynamicStitch (DynamicStitch) /device:GPU:0
  gradients/Mean_2_grad/Reshape (Reshape) /device:GPU:0
  gradients/Mean_2_grad/BroadcastTo (BroadcastTo) /device:GPU:0
  gradients/Mean_2_grad/Shape_2 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Shape_3 (Shape) /device:GPU:0
  gradients/Mean_2_grad/Const (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod (Prod) /device:GPU:0
  gradients/Mean_2_grad/Const_1 (Const) /device:GPU:0
  gradients/Mean_2_grad/Prod_1 (Prod) /device:GPU:0
  gradients/Mean_2_grad/Maximum/y (Const) /device:GPU:0
  gradients/Mean_2_grad/Maximum (Maximum) /device:GPU:0
  gradients/Mean_2_grad/floordiv (FloorDiv) /device:GPU:0
  gradients/Mean_2_grad/Cast (Cast) /device:GPU:0
  gradients/Mean_2_grad/truediv (RealDiv) /device:GPU:0

Op: Mean
Node attrs: Tidx=DT_INT32, keep_dims=false, T=DT_FLOAT
Registered kernels:
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='GPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_UINT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT8]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_INT32]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_HALF]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_BFLOAT16]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_FLOAT]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_DOUBLE]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX64]; Tidx in [DT_INT64]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT32]
  device='CPU'; T in [DT_COMPLEX128]; Tidx in [DT_INT64]

     [[{{node Mean_2}}]]

Original stack trace for 'Mean_2':
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 930, in _bootstrap
    self._bootstrap_inner()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
    self.run()
  File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 910, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/mainscripts/Trainer.py", line 46, in trainerThread
    model = models.import_model(model_class_name)(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/ModelBase.py", line 193, in __init__
    self.on_initialize()
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/models/Model_Quick96/Model.py", line 143, in on_initialize
    gpu_src_loss =  tf.reduce_mean ( 10*nn.dssim(gpu_target_src_masked_opt, gpu_pred_src_src_masked_opt, max_val=1.0, filter_size=int(resolution/11.6)), axis=[1])
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/.dfl/DeepFaceLab/core/leras/ops/__init__.py", line 308, in dssim
    ssim_val = tf.reduce_mean(luminance * cs, axis=nn.conv2d_spatial_axes )
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1082, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/math_ops.py", line 2581, in reduce_mean_v1
    return reduce_mean(input_tensor, axis, keepdims, name)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/traceback_utils.py", line 150, in error_handler
    return fn(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 1082, in op_dispatch_handler
    return dispatch_target(*args, **kwargs)
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/math_ops.py", line 2639, in reduce_mean
    gen_math_ops.mean(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/ops/gen_math_ops.py", line 6286, in mean
    _, _, _op, _outputs = _op_def_library._apply_op_helper(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/op_def_library.py", line 740, in _apply_op_helper
    op = g._create_op_internal(op_type_name, inputs, dtypes=None,
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 3776, in _create_op_internal
    ret = Operation(
  File "/Users/prashantpandey/Desktop/programming/deep_learning/dfl/DeepFaceLab_MacOS/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 2175, in __init__
    self._traceback = tf_stack.extract_stack_for_node(self._c_op)

The trace points to tf.reduce_mean as the problem. What can I do to solve this?

Please help.
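A note on reading the colocation debug info in the log: as far as I can tell, Mean itself has a GPU kernel registered, but its gradient ops are colocated with it, and the group as a whole reports supported_device_types_=[CPU]. The sketch below is not part of DeepFaceLab or TensorFlow; `cpu_only_ops` is a hypothetical helper that scans the "OpType: devices" lines of such a dump and lists the op types with no GPU entry. The sample lines are copied from the log above.

```python
import re

# Matches lines such as "Range: CPU " or "FloorDiv: GPU CPU " from the
# "Colocation group had the following types and supported devices" section.
_OP_LINE = re.compile(r"^\s*(\w+):\s+((?:GPU|CPU)(?:\s+(?:GPU|CPU))*)\s*$")


def cpu_only_ops(debug_lines):
    """Return the op types whose supported device list does not include GPU."""
    found = []
    for line in debug_lines:
        match = _OP_LINE.match(line)
        if match is None:
            continue  # not an "OpType: devices" line, skip it
        op_type = match.group(1)
        devices = match.group(2).split()
        if "GPU" not in devices:
            found.append(op_type)
    return found


# Lines taken verbatim from the colocation debug info in the log.
sample = [
    "FloorDiv: GPU CPU ",
    "RealDiv: GPU CPU ",
    "Shape: GPU CPU ",
    "Range: CPU ",
    "DynamicStitch: CPU ",
    "Reshape: GPU CPU ",
    "Mean: GPU CPU ",
]

print(cpu_only_ops(sample))
```

For the lines in this log, the only op types without a GPU entry are Range and DynamicStitch, both of which come from the gradient of Mean_2.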

prashantspandey commented 1 year ago

I solved it by installing matching versions: tensorflow-macos==2.8.0, tensorflow-metal==0.5.0 and numpy==1.23. Training now at least starts, but the loss doesn't go down; it just fluctuates randomly. In the preview, columns 2, 4 and 5, which contain the learned outputs, don't show up either.

So something is definitely wrong with the model, since the same problem of the loss not going down persists even on CPU.
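For anyone wanting to confirm their environment matches the version pins mentioned in this comment, here is a minimal sketch using only the standard library. The names `PINS`, `installed_version`, `matches` and `check_pins` are hypothetical, and treating "1.23" as "any 1.23.x release" is an assumption on my part. It only reports what is installed; it does not install anything.

```python
from importlib import metadata

# Version pins from the comment above; "1.23" is treated as a prefix (1.23.x).
PINS = {
    "tensorflow-macos": "2.8.0",
    "tensorflow-metal": "0.5.0",
    "numpy": "1.23",
}


def installed_version(name):
    """Return the installed version string for a package, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def matches(installed, pinned):
    """True if the installed version equals the pin or is a patch release of it."""
    if installed is None:
        return False
    return installed == pinned or installed.startswith(pinned + ".")


def check_pins(pins=PINS, lookup=installed_version):
    """Map each package name to (installed version, whether it matches the pin)."""
    report = {}
    for name, pinned in pins.items():
        found = lookup(name)
        report[name] = (found, matches(found, pinned))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_pins().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: installed={found} pinned={PINS[name]} {status}")
```

The `lookup` parameter exists only so the check can be exercised without the packages being installed, for example by passing a dictionary's `get` method.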