Open PhyllisJi opened 2 months ago
Hi @PhyllisJi -
Thanks for reporting the issue. From the code, it looks like you are using an older TensorFlow version (tf 2.12.0).
Could you try the latest versions, Keras 3.5.0 and TensorFlow 2.17.0, running the GPU computation inside a tf.distribute.MirroredStrategy() scope and the CPU computation inside keras.device('/device:CPU:0')?
With that setup, the difference between the CPU and GPU outputs should be minimal.
import tensorflow as tf
import keras
import numpy as np

print(tf.__version__)     # 2.17.0
print(keras.__version__)  # 3.5.0


def chebyshev_distance(A: np.ndarray, B: np.ndarray):
    # Largest element-wise absolute difference between A and B.
    if A is None or B is None:
        return 0.0
    if A.shape != B.shape:
        return 9999999
    else:
        return float(np.max(np.abs(A - B)))


inp = np.array([[25.05214, 6.4932823, 5.5203633, 12.618748, 27.186777, 3.7995481]])
input_shape = inp.shape[1:]

# GPU run: build and call the layer inside the MirroredStrategy scope.
strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))
with strategy.scope():
    act_layer = tf.keras.layers.Activation(activation='exponential')
    act_layer.build(input_shape)
    x_gpu = tf.constant(inp, dtype=tf.float32)
    output_gpu = act_layer(x_gpu)
    print("GPU output:", output_gpu.numpy())

# CPU run: same layer and input, pinned to the CPU device.
with keras.device('/device:CPU:0'):
    act_layer = tf.keras.layers.Activation(activation='exponential')
    act_layer.build(input_shape)
    x_cpu = tf.constant(inp, dtype=tf.float32)
    output_cpu = act_layer(x_cpu)
    print("CPU output:", output_cpu.numpy())  # was printing output_gpu here

output_diff = chebyshev_distance(output_cpu.numpy(), output_gpu.numpy())
print(output_diff)
This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.
Can you please tell me why tf 2.12.0 doesn't behave quite the same as the new version? Is there some kind of flaw?
Hi @PhyllisJi -
Here, tf 2.12.0 works with Keras 2, which does not have the multi-GPU training functionality (data parallelism and model parallelism) that uses the tf.distribute.MirroredStrategy API. So it is better to use the GPU with the tf.distribute.MirroredStrategy API on Keras 3.
But I'm only using one GPU for training. Will the results be affected as well?
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
No
Source
binary
TensorFlow version
tf 2.12.0
Custom code
Yes
OS platform and distribution
Ubuntu 20.04
Mobile device
No response
Python version
3.10
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current behavior?
Input: [[25.05214, 6.4932823, 5.5203633, 12.618748, 27.186777, 3.7995481]]
CPU output: [[7.5858780e+10 6.6068842e+02 2.4972575e+02 3.0217081e+05 6.4130895e+11 4.4680992e+01]]
GPU output: [[7.5858780e+10 6.6068842e+02 2.4972574e+02 3.0217081e+05 6.4130888e+11 4.4680988e+01]]
Max distance: 65536.0
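To put the reported Max distance in context, here is a minimal sketch that measures the same CPU and GPU values in float32 spacing units (ULPs) and as a relative error, rather than as a raw absolute difference. It assumes both outputs are float32 and uses the printed values as stand-ins for the exact results. The helpers to_float32 and float32_ulp are hypothetical names written for this illustration; they are not TensorFlow or Keras APIs.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def float32_ulp(x: float) -> float:
    """Gap between positive, finite x (as float32) and the next larger float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - to_float32(x)


# Values copied from the report above (assumed to be float32 results).
cpu = [7.5858780e+10, 6.6068842e+02, 2.4972575e+02,
       3.0217081e+05, 6.4130895e+11, 4.4680992e+01]
gpu = [7.5858780e+10, 6.6068842e+02, 2.4972574e+02,
       3.0217081e+05, 6.4130888e+11, 4.4680988e+01]

max_abs = 0.0
max_rel = 0.0
for c, g in zip(cpu, gpu):
    c32, g32 = to_float32(c), to_float32(g)
    diff = abs(c32 - g32)
    ulp = float32_ulp(max(c32, g32))
    max_abs = max(max_abs, diff)
    max_rel = max(max_rel, diff / max(c32, g32))
    print(f"cpu={c32:.8e} gpu={g32:.8e} abs diff={diff:.6g} (about {diff / ulp:.1f} ULP)")

print("max absolute difference:", max_abs)
print("max relative difference:", max_rel)
```

For the largest element (about 6.41e+11), neighbouring float32 values are 65536.0 apart, so the reported distance corresponds to the two devices disagreeing in the last bit of that one number, a relative difference of roughly 1e-7. NumPy's np.spacing on a float32 value reports the same gap.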
Standalone code to reproduce the issue
Relevant log output
No response