greatwhiz closed this issue 3 years ago
It ran after downgrading to TensorFlow 2.3.2. However, is it possible to fix the code so that it works with the latest TensorFlow?
Thanks! We have updated the code to make it work on TensorFlow 2.5. (Please tell us if the problem persists.)
I am running it in a free Colab with a T4 GPU. The TF version appears to be 2.5, and I also tried downgrading to 2.4.1. Both versions produce the same error.
The error is as below:
Cloning into 'tft-speedup'...
remote: Enumerating objects: 16, done.
remote: Counting objects: 100% (16/16), done.
remote: Compressing objects: 100% (14/14), done.
remote: Total 16 (delta 0), reused 13 (delta 0), pack-reused 0
Unpacking objects: 100% (16/16), done.
/content/tft-speedup/tft-speedup/tft-speedup
2021-05-25 22:13:33.712671: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-05-25 22:13:35.803273: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-05-25 22:13:35.805745: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcuda.so.1
2021-05-25 22:13:35.865853: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-25 22:13:35.866998: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.75GiB deviceMemoryBandwidth: 298.08GiB/s
2021-05-25 22:13:35.867045: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-05-25 22:13:35.893599: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-05-25 22:13:35.893684: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-05-25 22:13:35.995240: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-05-25 22:13:36.094845: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-05-25 22:13:36.299122: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-05-25 22:13:36.390671: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-05-25 22:13:36.391146: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-05-25 22:13:36.391286: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-25 22:13:36.391943: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-25 22:13:36.392481: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
example.py:135: RuntimeWarning: divide by zero encountered in remainder
for s_dependency, dependent, i in zip(
2021-05-25 22:13:36.457705: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-05-25 22:13:36.457860: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-05-25 22:13:36.458009: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-25 22:13:36.458660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties:
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.75GiB deviceMemoryBandwidth: 298.08GiB/s
2021-05-25 22:13:36.458703: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-05-25 22:13:36.458752: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublas.so.11
2021-05-25 22:13:36.458776: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcublasLt.so.11
2021-05-25 22:13:36.458795: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcufft.so.10
2021-05-25 22:13:36.458816: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcurand.so.10
2021-05-25 22:13:36.458849: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusolver.so.10
2021-05-25 22:13:36.458867: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcusparse.so.11
2021-05-25 22:13:36.458886: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudnn.so.8
2021-05-25 22:13:36.458961: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-25 22:13:36.459589: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-25 22:13:36.460124: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-05-25 22:13:36.460187: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
2021-05-25 22:13:37.208926: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-05-25 22:13:37.208994: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267] 0
2021-05-25 22:13:37.209010: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0: N
2021-05-25 22:13:37.209205: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-25 22:13:37.209931: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-25 22:13:37.210502: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:941] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-05-25 22:13:37.211023: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13968 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1677, in concat
return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1198, in concat_v2
values, axis, name=name, ctx=_ctx)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1228, in concat_v2_eager_fallback
_attr_T, values = _execute.args_to_matching_eager(list(values), ctx, [])
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/eager/execute.py", line 274, in args_to_matching_eager
t, dtype, preferred_dtype=default_dtype, ctx=ctx)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/profiler/trace.py", line 163, in wrapped
return func(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 1540, in convert_to_tensor
ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py", line 339, in _constant_tensor_conversion_function
return constant(v, dtype=dtype, name=name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py", line 265, in constant
allow_broadcast=True)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py", line 276, in _constant_impl
return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py", line 301, in _constant_eager_impl
t = convert_to_eager_tensor(value, ctx, dtype)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/constant_op.py", line 98, in convert_to_eager_tensor
return ops.EagerTensor(value, ctx.device_name, dtype)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/keras_tensor.py", line 274, in __array__
'Cannot convert a symbolic Keras input/output to a numpy array. '
TypeError: Cannot convert a symbolic Keras input/output to a numpy array. This error may indicate that you're trying to pass a symbolic value to a NumPy call, which is not supported. Or, you may be trying to pass Keras symbolic inputs/outputs to a TF API that does not register dispatching, preventing Keras from automatically converting the API call to a lambda layer in the Functional Model.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 1853, in _create_c_op
c_op = pywrap_tf_session.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Fill dimensions must be >= 0 for '{{node ones}} = Fill[T=DT_FLOAT, index_type=DT_INT32](tf.concat_6/concat, ones/Const)' with input shapes: [3], [] and with input tensors computed as partial shapes: input[0] = [?,100,5].
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "example.py", line 262, in <module>
run_simple_experiment()
File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "example.py", line 35, in run_simple_experiment
results_v = simple_experiment("vectorized")
File "example.py", line 219, in simple_experiment
model = tft_model.get_model_vectorized(model_capable_vectorize=True, single_sequence=True)
File "/content/tft-speedup/tft-speedup/tft-speedup/tft_model.py", line 904, in get_model_vectorized
historical_windowed, future_windowed, static_emb, batch_dimensions=2
File "/content/tft-speedup/tft-speedup/tft-speedup/tft_model.py", line 638, in build_base_tft_graph
get_lstm(return_state=False), batch_dimensions, historical_features
File "/content/tft-speedup/tft-speedup/tft-speedup/tf_utils.py", line 34, in timedistributed_over_more_batch_dimensions
seq_squashed, batch_shape_orig = squash_batch_dimensions(seq, batch_dims)
File "/content/tft-speedup/tft-speedup/tft-speedup/tf_utils.py", line 86, in squash_batch_dimensions
new_shape = tf.concat([[-1], retain_shape], axis=-1)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py", line 205, in wrapper
result = dispatch(wrapper, args, kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py", line 122, in dispatch
result = dispatcher.handle(op, args, kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/layers/core.py", line 1450, in handle
return TFOpLambda(op)(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 952, in __call__
input_list)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 1091, in _functional_construction_call
inputs, input_masks, args, kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 822, in _keras_tensor_symbolic_call
return self._infer_output_signature(inputs, args, kwargs, input_masks)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 869, in _infer_output_signature
keras_tensor.keras_tensor_from_tensor, outputs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/nest.py", line 659, in map_structure
structure[0], [func(x) for x in entries],
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/nest.py", line 659, in <listcomp>
structure[0], [func(x) for x in entries],
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/keras_tensor.py", line 606, in keras_tensor_from_tensor
out = keras_tensor_cls.from_tensor(tensor)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/keras_tensor.py", line 193, in from_tensor
inferred_value = array_ops.ones(shape=tensor).shape
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/array_ops.py", line 3132, in ones
output = fill(shape, constant(one, dtype=dtype), name=name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
return target(*args, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/array_ops.py", line 239, in fill
result = gen_array_ops.fill(dims, value, name=name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 3358, in fill
"Fill", dims=dims, value=value, name=name)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 750, in _apply_op_helper
attrs=attr_protos, op_def=op_def)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/func_graph.py", line 592, in _create_op_internal
compute_device)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 3536, in _create_op_internal
op_def=op_def)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 2016, in __init__
control_input_ops, op_def)
File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py", line 1856, in _create_c_op
raise ValueError(str(e))
ValueError: Fill dimensions must be >= 0 for '{{node ones}} = Fill[T=DT_FLOAT, index_type=DT_INT32](tf.concat_6/concat, ones/Const)' with input shapes: [3], [] and with input tensors computed as partial shapes: input[0] = [?,100,5].
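For anyone reading the traceback: the failing call is `tf.concat([[-1], retain_shape], axis=-1)` in `squash_batch_dimensions`, which builds a reshape target whose first entry is the `-1` placeholder. The final error message suggests that Keras then treats that vector as a real shape (`[?,100,5]`), where a negative entry is not allowed. Below is a minimal NumPy sketch of what squashing and restoring batch dimensions amounts to. It is not the repository's code: the function bodies, the array sizes and the name `unsquash_batch_dimensions` are assumptions made for illustration.

```python
import numpy as np


def squash_batch_dimensions(seq, batch_dims):
    """Merge the first `batch_dims` axes of `seq` into a single axis."""
    batch_shape_orig = seq.shape[:batch_dims]
    retain_shape = seq.shape[batch_dims:]
    # -1 asks reshape to infer the merged batch size; it is only a
    # placeholder and is not a valid dimension by itself.
    new_shape = (-1,) + tuple(retain_shape)
    return seq.reshape(new_shape), batch_shape_orig


def unsquash_batch_dimensions(seq_squashed, batch_shape_orig):
    """Restore the batch axes removed by squash_batch_dimensions (hypothetical helper)."""
    return seq_squashed.reshape(tuple(batch_shape_orig) + seq_squashed.shape[1:])


# Hypothetical sizes: 2 x 3 batch axes, 100 time steps, 5 features.
seq = np.zeros((2, 3, 100, 5), dtype=np.float32)
squashed, batch_shape = squash_batch_dimensions(seq, batch_dims=2)
restored = unsquash_batch_dimensions(squashed, batch_shape)
print(squashed.shape, restored.shape)

# A shape vector containing -1 works as a reshape target, but it cannot
# be used to allocate an array, which mirrors the "Fill dimensions must
# be >= 0" complaint in the traceback.
try:
    np.ones((-1, 100, 5))
    placeholder_rejected = False
except ValueError:
    placeholder_rejected = True
```

One workaround that is often suggested for this kind of error is to perform the reshape inside a Keras layer, so that the intermediate shape vector never becomes a symbolic Keras tensor. That has not been checked against this repository.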