Hello, CPU-based TensorFlow works for both inference and training, but the following error occurs when I use tensorflow-gpu. I have four 24 GB RTX 4090s. As soon as inference or training starts, GPU 0 suddenly reaches its limit while the other GPUs stay largely unused, and the code does not appear to use distributed training. How much GPU memory is needed to train this model?
2024-12-03 16:50:19.337225: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2024-12-03 16:50:19.433857: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 0 with properties:
name: NVIDIA GeForce RTX 4090 major: 8 minor: 9 memoryClockRate(GHz): 2.52
pciBusID: 0000:18:00.0
totalMemory: 23.65GiB freeMemory: 22.87GiB
2024-12-03 16:50:19.546432: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 1 with properties:
name: NVIDIA GeForce RTX 4090 major: 8 minor: 9 memoryClockRate(GHz): 2.52
pciBusID: 0000:3b:00.0
totalMemory: 23.65GiB freeMemory: 23.27GiB
2024-12-03 16:50:19.618871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 2 with properties:
name: NVIDIA GeForce RTX 4090 major: 8 minor: 9 memoryClockRate(GHz): 2.52
pciBusID: 0000:86:00.0
totalMemory: 23.65GiB freeMemory: 23.27GiB
2024-12-03 16:50:19.686616: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1432] Found device 3 with properties:
name: NVIDIA GeForce RTX 4090 major: 8 minor: 9 memoryClockRate(GHz): 2.52
pciBusID: 0000:af:00.0
totalMemory: 23.65GiB freeMemory: 23.27GiB
2024-12-03 16:50:19.686669: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1511] Adding visible gpu devices: 0, 1, 2, 3
2024-12-03 16:50:22.192435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:982] Device interconnect StreamExecutor with strength 1 edge matrix:
2024-12-03 16:50:22.192472: I tensorflow/core/common_runtime/gpu/gpu_device.cc:988] 0 1 2 3
2024-12-03 16:50:22.192478: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 0: N N N N
2024-12-03 16:50:22.192482: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 1: N N N N
2024-12-03 16:50:22.192501: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 2: N N N N
2024-12-03 16:50:22.192505: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1001] 3: N N N N
2024-12-03 16:50:22.192657: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 22164 MB memory) -> physical GPU (device: 0, name: NVIDIA GeForce RTX 4090, pci bus id: 0000:18:00.0, compute capability: 8.9)
2024-12-03 16:50:22.192978: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:1 with 22559 MB memory) -> physical GPU (device: 1, name: NVIDIA GeForce RTX 4090, pci bus id: 0000:3b:00.0, compute capability: 8.9)
2024-12-03 16:50:22.193198: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:2 with 22559 MB memory) -> physical GPU (device: 2, name: NVIDIA GeForce RTX 4090, pci bus id: 0000:86:00.0, compute capability: 8.9)
2024-12-03 16:50:22.193420: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:3 with 22559 MB memory) -> physical GPU (device: 3, name: NVIDIA GeForce RTX 4090, pci bus id: 0000:af:00.0, compute capability: 8.9)
2024-12-03 16:50:36.161998: E tensorflow/stream_executor/cuda/cuda_blas.cc:652] failed to run cuBLAS routine cublasSgemm_v2: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(9, 9), b.shape=(9, 256), m=9, n=256, k=9
[[{{node MV_analysis/layer_0/signal_conv2d/kernel_rdft/Initializer/MatMul}} = MatMul[T=DT_FLOAT, _class=["loc:@MV_analysis/layer_0/signal_conv2d/kernel_rdft/Assign"], transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MV_analysis/layer_0/signal_conv2d/irdft_3x3, MV_analysis/layer_0/signal_conv2d/kernel_rdft/Initializer/Reshape)]]
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "OpenDVC_train_PSNR.py", line 123, in <module>
sess.run(tf.global_variables_initializer())
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(9, 9), b.shape=(9, 256), m=9, n=256, k=9
[[node MV_analysis/layer_0/signal_conv2d/kernel_rdft/Initializer/MatMul (defined at /home/user/zhangm/videoCompress/OpenDVC/tensorflow_compression/python/layers/parameterizers.py:106) = MatMul[T=DT_FLOAT, _class=["loc:@MV_analysis/layer_0/signal_conv2d/kernel_rdft/Assign"], transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MV_analysis/layer_0/signal_conv2d/irdft_3x3, MV_analysis/layer_0/signal_conv2d/kernel_rdft/Initializer/Reshape)]]
Caused by op 'MV_analysis/layer_0/signal_conv2d/kernel_rdft/Initializer/MatMul', defined at:
File "OpenDVC_train_PSNR.py", line 49, in <module>
flow_latent = CNN_img.MV_analysis(flow_tensor, args.N, args.M)
File "/home/user/zhangm/videoCompress/OpenDVC/CNN_img.py", line 17, in MV_analysis
tensor = layer(tensor)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 374, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 746, in __call__
self.build(input_shapes)
File "/home/user/zhangm/videoCompress/OpenDVC/tensorflow_compression/python/layers/signal_conv.py", line 411, in build
regularizer=self.kernel_regularizer)
File "/home/user/zhangm/videoCompress/OpenDVC/tensorflow_compression/python/layers/parameterizers.py", line 119, in __call__
initializer=rdft_initializer, regularizer=regularizer)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 495, in add_variable
return self.add_weight(*args, **kwargs)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/layers/base.py", line 288, in add_weight
getter=vs.get_variable)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 609, in add_weight
aggregation=aggregation)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/training/checkpointable/base.py", line 639, in _add_variable_with_custom_getter
**kwargs_for_getter)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1487, in get_variable
aggregation=aggregation)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 1237, in get_variable
aggregation=aggregation)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 540, in get_variable
aggregation=aggregation)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 492, in _true_getter
aggregation=aggregation)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 922, in _get_single_variable
aggregation=aggregation)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 183, in __call__
return cls._variable_v1_call(*args, **kwargs)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 146, in _variable_v1_call
aggregation=aggregation)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 125, in <lambda>
previous_getter = lambda **kwargs: default_variable_creator(None, **kwargs)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 2444, in default_variable_creator
expected_shape=expected_shape, import_scope=import_scope)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 187, in __call__
return super(VariableMetaclass, cls).__call__(*args, **kwargs)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1329, in __init__
constraint=constraint)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variables.py", line 1437, in _init_from_args
initial_value(), name="initial_value", dtype=dtype)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/variable_scope.py", line 896, in <lambda>
shape.as_list(), dtype=dtype, partition_info=partition_info)
File "/home/user/zhangm/videoCompress/OpenDVC/tensorflow_compression/python/layers/parameterizers.py", line 106, in rdft_initializer
init = math_ops.matmul(irdft_matrix, init, transpose_a=True)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 2057, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 4560, in mat_mul
name=name)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/home/user/miniconda3/envs/OpenDVC/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(9, 9), b.shape=(9, 256), m=9, n=256, k=9
[[node MV_analysis/layer_0/signal_conv2d/kernel_rdft/Initializer/MatMul (defined at /home/user/zhangm/videoCompress/OpenDVC/tensorflow_compression/python/layers/parameterizers.py:106) = MatMul[T=DT_FLOAT, _class=["loc:@MV_analysis/layer_0/signal_conv2d/kernel_rdft/Assign"], transpose_a=true, transpose_b=false, _device="/job:localhost/replica:0/task:0/device:GPU:0"](MV_analysis/layer_0/signal_conv2d/irdft_3x3, MV_analysis/layer_0/signal_conv2d/kernel_rdft/Initializer/Reshape)]]
For training at 256x256 with batch size = 4, 24GB should be more than enough. I would recommend checking that no other jobs are occupying your GPUs.
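One way to run that check from Python is sketched below. It is only a rough helper, not part of OpenDVC: the function names (`parse_gpu_memory`, `least_busy_gpu`, `query_gpus`) are made up for illustration, and it assumes `nvidia-smi` is on the PATH and reports memory in MiB. It lists how much memory is already in use on each GPU and picks the least busy one.

```python
import shutil
import subprocess

# Ask nvidia-smi for one CSV row per GPU: index, used MiB, total MiB.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse rows like '0, 1486, 24564' into (index, used_mib, total_mib)."""
    rows = []
    for line in csv_text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        try:
            index, used, total = (int(field) for field in fields)
        except ValueError:
            # Skip rows that are not three integers (e.g. '[N/A]' values).
            continue
        rows.append((index, used, total))
    return rows


def least_busy_gpu(rows):
    """Return the index of the GPU with the least memory currently in use."""
    return min(rows, key=lambda row: row[1])[0]


def query_gpus():
    """Run nvidia-smi and return parsed rows, or [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            QUERY,
            stdout=subprocess.PIPE,
            universal_newlines=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return []
    return parse_gpu_memory(result.stdout)
```

If `query_gpus()` shows GPU 0 already heavily used before the script starts, another job is probably holding it. In that case, setting `os.environ["CUDA_VISIBLE_DEVICES"] = str(least_busy_gpu(rows))` before `import tensorflow` should restrict the process to the idle card.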