keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

When using exponential as the activation function, the outputs of the CPU and GPU have large differences #20310

Open PhyllisJi opened 2 days ago

PhyllisJi commented 2 days ago

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

No

Source

binary

TensorFlow version

tf 2.12.0

Custom code

Yes

OS platform and distribution

Ubuntu 20.04

Mobile device

No response

Python version

3.10

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

Input: [[25.05214, 6.4932823, 5.5203633, 12.618748, 27.186777, 3.7995481]]
CPU output: [[7.5858780e+10 6.6068842e+02 2.4972575e+02 3.0217081e+05 6.4130895e+11 4.4680992e+01]]
GPU output: [[7.5858780e+10 6.6068842e+02 2.4972574e+02 3.0217081e+05 6.4130888e+11 4.4680988e+01]]
Max distance: 65536.0

Standalone code to reproduce the issue

import tensorflow as tf
import numpy as np

tf.random.set_seed(42)
tf.config.experimental.enable_op_determinism()

def chebyshev_distance(A: np.ndarray, B: np.ndarray):
    if A is None or B is None:
        return 0.0
    if A.shape != B.shape:
        return 9999999
    else:
        return float(np.max(np.abs(A - B)))

act_layer = tf.keras.layers.Activation(activation='exponential')
inp = np.array([[25.05214, 6.4932823, 5.5203633, 12.618748, 27.186777, 3.7995481]])
input_shape = inp.shape[1:]
act_layer.build(input_shape)

with tf.device('/CPU:0'):
    x_cpu = tf.constant(inp, dtype=tf.float32)
    output_cpu = act_layer(x_cpu)
    print("CPU output:", output_cpu.numpy())

if tf.config.list_physical_devices('GPU'):
    with tf.device('/GPU:0'):
        x_gpu = tf.constant(inp, dtype=tf.float32)
        output_gpu = act_layer(x_gpu)
        print("GPU output:", output_gpu.numpy())
    # Compare only when a GPU result exists; otherwise output_gpu is undefined.
    output_diff = chebyshev_distance(output_cpu.numpy(), output_gpu.numpy())
    print("Max distance:", output_diff)
else:
    print("GPU not available.")

Relevant log output

No response

mehtamansi29 commented 2 days ago

Hi @PhyllisJi -

Thanks for reporting the issue. From the code, it looks like you are using an older TensorFlow version (tf 2.12.0). Could you try the latest versions, Keras 3.5.0 and TensorFlow 2.17.0, running on the GPU inside a tf.distribute.MirroredStrategy() scope and on the CPU inside keras.device('/device:CPU:0')? With this setup, the difference between the CPU and GPU outputs should be minimal.

import tensorflow as tf
import keras
print(tf.__version__)         #2.17.0
print(keras.__version__)      #3.5.0
import numpy as np

def chebyshev_distance(A: np.ndarray, B: np.ndarray):
    if A is None or B is None:
        return 0.0
    if A.shape != B.shape:
        return 9999999
    else:
        return float(np.max(np.abs(A - B)))

strategy = tf.distribute.MirroredStrategy()
print('Number of devices: {}'.format(strategy.num_replicas_in_sync))

with strategy.scope():
    act_layer = tf.keras.layers.Activation(activation='exponential')
    inp = np.array([[25.05214, 6.4932823, 5.5203633, 12.618748, 27.186777, 3.7995481]])
    input_shape = inp.shape[1:]
    act_layer.build(input_shape)

    x_gpu = tf.constant(inp, dtype=tf.float32)
    output_gpu = act_layer(x_gpu)
    print("GPU output:", output_gpu.numpy())

with keras.device('/device:CPU:0'):
    act_layer = tf.keras.layers.Activation(activation='exponential')
    inp = np.array([[25.05214, 6.4932823, 5.5203633, 12.618748, 27.186777, 3.7995481]])
    input_shape = inp.shape[1:]
    act_layer.build(input_shape)
    x_cpu = tf.constant(inp, dtype=tf.float32)
    output_cpu = act_layer(x_cpu)
    print("CPU output:", output_cpu.numpy())
output_diff = chebyshev_distance(output_cpu.numpy(), output_gpu.numpy())
print(output_diff)
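
For context on the reported max distance, here is a minimal sketch that uses only NumPy and the CPU/GPU values copied from the issue description. It does not run TensorFlow. It expresses the gap in float32 ULPs (units in the last place) and as a relative error, since an absolute distance grows with the magnitude of exp(x). The variable names are illustrative and not part of the original report. Between 2^39 and 2^40, which is where 6.41e+11 falls, adjacent float32 values are 2^16 = 65536 apart, so an absolute gap of 65536.0 there may be as small as one representable step.

```python
import numpy as np

# Values copied from the "Current behavior?" section of the issue.
cpu = np.array([[7.5858780e+10, 6.6068842e+02, 2.4972575e+02,
                 3.0217081e+05, 6.4130895e+11, 4.4680992e+01]], dtype=np.float32)
gpu = np.array([[7.5858780e+10, 6.6068842e+02, 2.4972574e+02,
                 3.0217081e+05, 6.4130888e+11, 4.4680988e+01]], dtype=np.float32)

# Absolute (Chebyshev-style) difference, as in the reproduction script.
abs_diff = np.abs(cpu - gpu)

# Spacing between adjacent float32 values at each output's magnitude.
ulp = np.spacing(np.maximum(np.abs(cpu), np.abs(gpu)))

# The same difference measured in ULPs and as a relative error.
diff_in_ulps = abs_diff / ulp
rel_diff = abs_diff / np.abs(cpu)

print("Max absolute difference:", float(abs_diff.max()))
print("Max difference in ULPs:", float(diff_in_ulps.max()))
print("Max relative difference:", float(rel_diff.max()))
print("float32 machine epsilon:", float(np.finfo(np.float32).eps))
```

If a pass/fail check is needed, a relative tolerance such as np.testing.assert_allclose(cpu, gpu, rtol=1e-6) may be a more informative comparison than the maximum absolute distance. The rtol value here is an assumption chosen for illustration, not a recommendation from the Keras or TensorFlow teams.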